aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4/tcp_timer.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge 3.18.100 into android-msm-marlin-3.18Thierry Strudel2018-03-201-0/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.18.100 fixup: sctp: verify size of a new chunk in _sctp_make_chunk() serial: 8250_pci: Add Brainboxes UC-260 4 port serial device usb: usbmon: Read text within supplied buffer size USB: usbmon: remove assignment from IS_ERR argument * usb: quirks: add control message delay for 1b1c:1b20 * staging: android: ashmem: Fix lockdep issue during llseek uas: fix comparison for error code tty/serial: atmel: add new version check for usart serial: sh-sci: prevent lockup on full TTY buffers x86: Treat R_X86_64_PLT32 as R_X86_64_PC32 x86/module: Detect and skip invalid relocations scripts: recordmcount: break hardlinks ubi: Fix race condition between ubi volume creation and udev netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt netfilter: bridge: ebt_among: add missing match size checks * netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets * netfilter: IDLETIMER: be syzkaller friendly * netfilter: nat: cope with negative port range netfilter: x_tables: fix missing timer initialization in xt_LED ALSA: seq: More protection for concurrent write and ioctl races ALSA: seq: Don't allow resizing pool in use x86/MCE: Serialize sysfs changes Input: matrix_keypad - fix race when disabling interrupts MIPS: BMIPS: Do not mask IPIs during suspend scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS Linux 3.18.99 * dm io: fix duplicate bio completion due to missing ref count * fib_semantics: Don't match route with mismatching tclassid * net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68 sctp: verify size of a new chunk in _sctp_make_chunk() s390/qeth: fix IPA command submission race s390/qeth: fix SETIP command handling sctp: fix dst refcnt leak in sctp_v6_get_dst() * udplite: fix partial checksum initialization * ppp: prevent unregistered channels from connecting to PPP units * netlink: ensure to loop over all netns in genlmsg_multicast_allns() * net: fix race on decreasing number of TX queues * ipv6 sit: work around bogus gcc-8 -Wrestrict warning hdlc_ppp: carrier detect ok, don't turn off negotiation * bridge: check brport attr show in brport_show * leds: do not overflow sysfs buffer in led_trigger_show net: fec: introduce fec_ptp_stop and use in probe fail path ARM: mvebu: Fix broken PL310_ERRATA_753970 selects cpufreq: s3c24xx: Fix broken s3c_cpufreq_init() * ALSA: usb-audio: Add a quirck for B&W PX headphones tpm_i2c_nuvoton: fix potential buffer overruns caused by bit glitches on the bus tpm_i2c_infineon: fix potential buffer overruns caused by bit glitches on the bus Linux 3.18.98 net: gianfar_ptp: move set_fipers() to spinlock protecting area sctp: make use of pre-calculated len xen/gntdev: Fix partial gntdev_mmap() cleanup xen/gntdev: Fix off-by-one error when unmapping with holes SolutionEngine771x: fix Ether platform data mdio-sun4i: Fix a memory leak xen-netfront: enable device after manual module load drm/ttm: check the return value of kzalloc e1000: fix disabling already-disabled warning xfs: quota: check result of register_shrinker() xfs: quota: fix missed destroy of qi_tree_lock s390/dasd: fix wrongly assigned configuration data * led: core: Fix brightness setting when setting delay_off=0 bnx2x: Improve reliability in case of nested PCI errors tg3: Enable PHY reset in MTU change path for 5720 tg3: Add workaround to restrict 5762 MRRS to 2048 scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error net: arc_emac: fix arc_emac_rx() error paths spi: atmel: fixed spin_lock usage inside atmel_spi_remove * sget(): handle failures of register_shrinker() * ipv6: icmp6: Allow icmp messages to be looped back mtd: nand: gpmi: Fix failure when a erased page has a bitflip at BBM * hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers) * ipv6: Skip XFRM lookup if dst_entry in socket cache is valid Linux 3.18.97 * ASN.1: fix out-of-bounds read when parsing indefinite length item * usb: gadget: f_fs: Process all descriptors during bind * usb: dwc3: gadget: Set maxpacket size for ep0 IN * arm64: Disable unhandled signal log messages by default * irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq() x86/oprofile: Fix bogus GCC-8 warning in nmi_setup() iio: adis_lib: Initialize trigger before requesting interrupt * iio: buffer: check if a buffer has been set up when poll is called cfg80211: fix cfg80211_beacon_dup scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info PCI: keystone: Fix interrupt-controller-node lookup * netfilter: drop outermost socket lock in getsockopt() Linux 3.18.96 crypto: s5p-sss - Fix kernel Oops in AES-ECB mode KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously hippi: Fix a Fix a possible sleep-in-atomic bug in rr_close * xen: XEN_ACPI_PROCESSOR is Dom0-only x86/mm/kmmio: Fix mmiotrace for page unaligned addresses * mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep dmaengine: jz4740: disable/unprepare clk if probe fails * xfrm: Fix stack-out-of-bounds with misconfigured transport mode policies. spi: sun4i: disable clocks in the remove function * 509: fix printing uninitialized stack memory when OID is empty btrfs: Fix possible off-by-one in btrfs_search_path_in_tree net_sched: red: Avoid illegal values net_sched: red: Avoid devision by zero gianfar: fix a flooded alignment reports because of padding issue. s390/dasd: prevent prefix I/O error powerpc/perf: Fix oops when grouping different pmu events scripts/kernel-doc: Don't fail with status != 0 if error encountered with -none media: s5k6aa: describe some function parameters perf bench numa: Fixup discontiguous/sparse numa nodes perf top: Fix window dimensions change handling ARM: dts: am4372: Correct the interrupts_properties of McASP ARM: AM33xx: PRM: Remove am33xx_pwrdm_read_prev_pwrst function * usb: build drivers/usb/common/ when USB_SUPPORT is set usbip: keep usbip_device sockfd state in sync with tcp_socket dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock video: fbdev/mmp: add MODULE_LICENSE ASoC: ux500: add MODULE_LICENSE tag * selinux: ensure the context is NUL terminated in security_context_to_sid_core() * Provide a function to create a NUL-terminated string from unterminated data * net: avoid skb_warn_bad_offload on IS_ERR netfilter: xt_RATEEST: acquire xt_rateest_mutex for hash insert * netfilter: on sockopt() acquire sock lock only in the required scope netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check() * netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_{match|target} * netfilter: x_tables: fix int overflow in xt_alloc_table_info() crypto: x86/twofish-3way - Fix %rbp usage * selinux: skip bounded transition processing if the policy isn't loaded * xfrm: check id proto in validate_tmpl() * mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed. media: r820t: fix r820t_write_reg for KASAN ARM: dts: s5pv210: add interrupt-parent for ohci ALSA: seq: Fix racy pool initializations Btrfs: fix crash due to not cleaning up tree log block's dirty bits Btrfs: fix deadlock in run_delalloc_nocow console/dummy: leave .con_font_get set to NULL video: fbdev: atmel_lcdfb: fix display-timings lookup ext4: correct documentation for grpid mount option * ext4: save error to disk in __ext4_grp_locked_error() drm/radeon: adjust tested variable ALSA: seq: Fix regression by incorrect ioctl_mutex usages arm: spear13xx: Fix spics gpio controller's warning arm: spear13xx: Fix dmas cells arm: spear600: Add missing interrupt-parent of rtc s390: fix handling of -1 in set{,fs}[gu]id16 syscalls * PM / devfreq: Propagate error from devfreq_add_device() IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports Linux 3.18.95 mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copy ACPI: sbshc: remove raw pointer from printk() message pktcdvd: Fix pkt_setup_dev() error path EDAC, octeon: Fix an uninitialized variable warning xtensa: fix futex_atomic_cmpxchg_inatomic alpha: fix reboot on Avanti platform alpha: fix crash if pthread_create races with signal delivery signal/sh: Ensure si_signo is initialized in do_divide_error signal/openrisc: Fix do_unaligned_access to send the proper signal * kernel/async.c: revert "async: simplify lowest_in_progress()" media: cxusb, dib0700: ignore XC2028_I2C_FLUSH crypto: caam - fix endless loop when DECO acquire fails * crypto: cryptd - pass through absence of ->setkey() * crypto: hash - introduce crypto_hash_alg_has_setkey() * kernfs: fix regression in kernfs_fop_write caused by wrong type NFS: commit direct writes even if they fail partially NFS: Add a cond_resched() to nfs_commit_release_pages() mtd: nand: Fix nand_do_read_oob() return value media: dvb-usb-v2: lmedm04: move ts2020 attach to dm04_lme2510_tuner media: dvb-usb-v2: lmedm04: Improve logic checking of warm start dccp: CVE-2017-8824: use-after-free in DCCP code usbip: vhci: stop printing kernel pointer addresses in messages usbip: stub: stop printing kernel pointer addresses in messages usbip: prevent leaking socket pointer address in messages usbip: vhci-hcd: Add USB3 SuperSpeed support usb: usbip: Fix possible deadlocks reported by lockdep usbip: Fix potential format overflow in userspace tools usbip: prevent vhci_hcd driver from leaking a socket pointer address usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input usbip: fix stub_rx: get_pipe() to validate endpoint number * posix-timer: Properly check sigevent->sigev_notify CIFS: zero sensitive data when freeing cifs: Fix autonegotiate security settings mismatch cifs: Fix missing put_xid in cifs_file_strict_mmap * ipv4: Map neigh lookup keys in __ipv4_neigh_lookup_noref() * KEYS: encrypted: fix buffer overread in valid_master_desc() ARM: exynos_defconfig: Enable NFSv4 client ARM: exynos_defconfig: Enable options to mount a rootfs via NFS * tcp: release sk_frag.page in tcp_disconnect r8169: fix RTL8168EP take too long to complete driver initialization. qlcnic: fix deadlock bug * net: igmp: add a missing rcu locking section ip6mr: fix stale iterator vhost_net: stop device during reset owner Linux 3.18.94 um: Fix out-of-tree build ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE spi: imx: do not access registers while clocks disabled * selinux: general protection fault in sock_has_perm usb: uas: unconditionally bring back host after reset * usb: f_fs: Prevent gadget unbind if it is already unbound * USB: serial: simple: add Motorola Tetra driver usbip: list: don't list devices attached to vhci_hcd usbip: prevent bind loops on devices attached to vhci_hcd USB: serial: io_edgeport: fix possible sleep-in-atomic CDC-ACM: apply quirk for card reader USB: cdc-acm: Do not log urb submission errors on disconnect * USB: serial: pl2303: new device id for Chilitag staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID * usb: gadget: don't dereference g until after it has been null checked media: usbtv: add a new usbid * scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg * quota: Check for register_shrinker() failure. * net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit hwmon: (pmbus) Use 64bit math for DIRECT format values nfsd: check for use of the closed special stateid nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0) xen-netfront: remove warning when unloading module KVM: VMX: Fix rflags cache during vCPU reset mac80211: fix the update of path metric for RANN frame bcache: check return value of register_shrinker KVM: X86: Fix operand/address-size during instruction decoding KVM: x86: Don't re-execute instruction when not passing CR2 value KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure igb: Free IRQs when device is hotplugged gpio: iop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE ALSA: seq: Make ioctls race-free * loop: fix concurrent lo_open/lo_release um: Remove copy&paste code from init.h um: Stop abusing __KERNEL__ um: link vmlinux with -no-pie * Input: do not emit unneeded EV_SYN when suspending Linux 3.18.93 * hrtimer: Reset hrtimer cpu base proper on CPU hotplug * ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY * ipv6: fix udpv6 sendmsg crash caused by too small MTU * net: Allow neigh contructor functions ability to modify the primary_key vmxnet3: repair memory leak sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf sctp: do not allow the v4 socket to bind a v4mapped v6 address * pppoe: take ->needed_headroom of lower device into account on xmit * net: qdisc_pkt_len_init() should be more robust * tcp: __tcp_hdrlen() helper * net: igmp: fix source address check for IGMPv3 reports dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state * net: tcp: close sock if net namespace is exiting x86/microcode/intel: Extend BDW late-loading further with LLC size check * eventpoll.h: add missing epoll event masks scsi: libiscsi: fix shifting of DID_REQUEUE host byte * fs/fcntl: f_setown, avoid undefined behaviour reiserfs: don't preallocate blocks for extended attributes reiserfs: fix race in prealloc discard netfilter: xt_osf: Add missing permission checks netfilter: nfnetlink_cthelper: Add missing permission checks netfilter: nf_conntrack_sip: extend request line validation * netfilter: restart search if moved to other chain * netfilter: nf_ct_expect: remove the redundant slash when policy name is empty ipc: msg, make msgrcv work with LONG_MIN hwpoison, memcg: forcibly uncharge LRU pages * mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once usbip: Fix implicit fallthrough warning x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels MIPS: AR7: ensure the port type's FCR value is used arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6 dm btree: fix serious bug in btree_split_beneath() ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7 * phy: work around 'phys' references to usb-nop-xceiv devices Input: twl4030-vibra - fix sibling-node lookup Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning Input: twl6040-vibra - fix child-node lookup Input: twl6040-vibra - fix DT node memory management Input: 88pm860x-ts - fix child-node lookup * pipe: avoid round_pipe_size() nr_pages overflow on 32-bit * af_key: fix buffer overread in parse_exthdrs() * af_key: fix buffer overread in verify_address_len() ALSA: hda - Apply the existing quirk to iMac 14,1 * ALSA: pcm: Remove yet superfluous WARN_ON() * futex: Prevent overflow by strengthen input validation * scsi: sg: disable SET_FORCE_LOW_DMA * gcov: disable for COMPILE_TEST Linux 3.18.92 e1000e: Fix e1000_check_for_copper_link_ich8lan return value. uas: ignore UAS for Norelsys NS1068(X) chips * Bluetooth: Prevent stack info leak from the EFS element. * staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl usbip: remove kernel addresses from usb device and urb debug msgs USB: fix usbmon BUG trigger usb: misc: usb3503: make sure reset is low for at least 100us USB: serial: cp210x: add new device ID ELV ALC 8xxx USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ Revert "can: kvaser_usb: free buf in error paths" target: Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK iscsi-target: Make TASK_REASSIGN use proper se_cmd->cmd_kref x86/microcode/intel: Extend BDW late-loading with a revision check * crypto: algapi - fix NULL dereference in crypto_remove_spawns() * net: stmmac: enable EEE in MII, GMII or RGMII only sh_eth: fix SH7757 GEther initialization sh_eth: fix TSU resource handling RDS: null pointer dereference in rds_atomic_free_op RDS: Heap OOB write in rds_message_alloc_sgs() 8021q: fix a memory leak for VLAN 0 device x86/acpi: Reduce code duplication in mp_override_legacy_irq() ALSA: aloop: Fix racy hw constraints adjustment ALSA: aloop: Fix inconsistent format due to incomplete rule ALSA: aloop: Release cable upon open error path ALSA: pcm: Allow aborting mutex lock at OSS read/write loops ALSA: pcm: Abort properly at pending signal in OSS read/write loops ALSA: pcm: Add missing error checks in OSS emulation plugin builder * ALSA: pcm: Remove incorrect snd_BUG_ON() usages x86/acpi: Handle SCI interrupts above legacy space gracefully kvm: vmx: Scrub hardware GPRs at VM-exit * perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset accesses MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA MIPS: Consistently handle buffer counter with PTRACE_SETREGSET MIPS: Guard against any partial write attempt with PTRACE_SETREGSET MIPS: Factor out NT_PRFPREG regset access helpers IB/srpt: Disable RDMA access by the initiator can: gs_usb: fix return value of the "set_bittiming" callback Input: elantech - add new icbody type 15 * kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal() * kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals * kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL fscache: Fix the default for fscache_maybe_release_page() crypto: n2 - cure use after free kernel/acct.c: fix the acct->needcheck check in check_free_space() Linux 3.18.91 * n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD) * usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201 * usb: add RESET_RESUME for ELSA MicroLink 56K * usb: Add device quirk for Logitech HD Pro Webcam C925e USB: serial: option: add support for Telit ME910 PID 0x1101 * net: ipv4: fix for a race condition in raw_sendmsg sctp: Replace use of sockets_allocated with specified macro. net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case tg3: Fix rx hang on MTU change with 5717/5719 * tcp md5sig: Use skb's saddr when replying to an incoming segment net: qmi_wwan: add Sierra EM7565 1199:9091 * netlink: Add netns check on taps * net: igmp: Use correct source address on IGMPv3 reports * ipv6: mcast: better catch silly mtu values * ipv4: igmp: guard against silly MTU values * kbuild: add '-fno-stack-check' to kernel build options ASoC: twl4030: fix child-node lookup * ring-buffer: Mask out the info bits when returning buffer page length * tracing: Fix crash when it fails to alloc ring buffer * tracing: Fix possible double free on failure of allocating trace buffer * tracing: Remove extra zeroing out of the ring buffer page net: mvneta: clear interface link status on port disable powerpc/perf: Dereference BHRB entries safely KVM: X86: Fix load RFLAGS w/o the fixed bit parisc: Hide Diva-built-in serial aux and graphics card * PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() * ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU * ALSA: rawmidi: Avoid racy info ioctl via ctl device mfd: twl6040: Fix child-node lookup mfd: twl4030-audio: Fix sibling-node lookup crypto: mcryptd - protect the per-CPU queue with a lock ACPI: APEI / ERST: Fix missing error handling in erst_reader() Linux 3.18.90 fm10k: ensure we process SM mbx when processing VF mbx scsi: lpfc: PLOGI failures during NPIV testing scsi: lpfc: Fix secure firmware updates PCI/AER: Report non-fatal errors only to the affected endpoint igb: check memory allocation failure PCI: Create SR-IOV virtfn/physfn links before attaching driver scsi: cxgb4i: fix Tx skb leak * PCI: Avoid bus reset if bridge itself is broken net: phy: at803x: Change error to EINVAL for invalid MAC crypto: crypto4xx - increase context and scatter ring buffer elements backlight: pwm_bl: Fix overflow condition cpuidle: powernv: Pass correct drv->cpumask for registration ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory * xhci: plat: Register shutdown for xhci_plat isdn: kcapi: avoid uninitialized data ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to register netfilter: nfnl_cthelper: fix a race when walk the nf_ct_helper_hash table irda: vlsi_ir: fix check for DMA mapping errors i40e: Do not enable NAPI on q_vectors that have no rings * net: Do not allow negative values for busy_read and busy_poll sysctl interfaces s390/qeth: no ETH header for outbound AF_IUCV * HID: xinmo: fix for out of range for THT 2P arcade controller. hwmon: (asus_atk0110) fix uninitialized data access ARM: dts: ti: fix PCI bus dtc warnings KVM: x86: correct async page present tracepoint scsi: lpfc: Fix PT2PT PRLI reject netfilter: nfnl_cthelper: Fix memory leak netfilter: nfnl_cthelper: fix runtime expectation policy updates usb: gadget: udc: remove pointer dereference after free usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed net: qmi_wwan: Add USB IDs for MDM6600 modem on Motorola Droid 4 * crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex * r8152: fix the list rx_done may be used without initialization * cpuidle: Validate cpu_dev in cpuidle_add_sysfs() ALSA: hda - add support for docking station for HP 820 G2 * arm64: Initialise high_memory global variable earlier Linux 3.18.89 usb: musb: da8xx: fix babble condition handling ath9k: fix tx99 potential info leak macvlan: Only deliver one copy of the frame to the macvlan interface udf: Avoid overflow when session starts at large offset scsi: bfa: integer overflow in debugfs * scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry raid5: Set R5_Expanded on parity devices as well as data. * pinctrl: adi2: Fix Kconfig build problem * tty fix oops when rmmod 8250 * PCI: Detach driver before procfs & sysfs teardown on device remove xfs: fix log block underflow during recovery cycle verification bcache: fix wrong cache_misses statistics bcache: explicitly destroy mutex while exiting GFS2: Take inode off order_write list when setting jdata flag * thermal/drivers/step_wise: Fix temperature regulation misbehavior * ppp: Destroy the mutex when cleanup clk: tegra: Fix cclk_lp divisor register * mm: Handle 0 flags in _calc_vm_trans() macro arm-ccn: perf: Prevent module unload while PMU is in use target/file: Do not return error for UNMAP if length is zero target:fix condition return in core_pr_dump_initiator_port() iscsi-target: fix memory leak in lio_target_tiqn_addtpg() target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd() powerpc/ipic: Fix status get and status clear powerpc/opal: Fix EBUSY bug in acquiring tokens powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo PCI/PME: Handle invalid data when reading Root Status video: fbdev: au1200fb: Return an error code if a memory allocation fails video: fbdev: au1200fb: Release some resources if a memory allocation fails video: udlfb: Fix read EDID timeout fbdev: controlfb: Add missing modes to fix out of bounds access target: Use system workqueue for ALUA transitions btrfs: add missing memset while reading compressed inline extents NFSv4.1 respect server's max size in CREATE_SESSION perf symbols: Fix symbols__fixup_end heuristic for corner cases afs: Fix afs_kill_pages() afs: Fix page leak in afs_write_begin() afs: Populate and use client modification time afs: Fix the maths in afs_fs_store_data() afs: Flush outstanding writes when an fd is closed afs: Adjust mode bits processing afs: Populate group ID from vnode status afs: Fix missing put_page() drm/radeon: reinstate oland workaround for sclk * sched/deadline: Use deadline instead of period when calculating overflow drm/radeon/si: add dpm quirk for Oland openrisc: fix issue handling 8 byte get_user calls * net: Resend IGMP memberships upon peer notification. * dmaengine: Fix array index out of bounds warning in __get_unmap_pool() net: wimax/i2400m: fix NULL-deref at probe Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list NFSD: fix nfsd_reset_versions for NFSv4. NFSD: fix nfsd_minorversion(.., NFSD_AVAIL) net: bcmgenet: Power up the internal PHY before probing the MII net: bcmgenet: correct MIB access of UniMAC RUNT counters net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values usb: phy: isp1301: Add OF device ID table mac80211: Fix addition of mesh configuration element * KEYS: Don't permit request_key() to construct a new keyring * Don't leak a key reference if request_key() tries to use a revoked keyring * ext4: fix crash when a directory's i_size is too small * xhci: Don't add a virt_dev to the devs array before it's fully allocated usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer * USB: core: prevent malicious bNumInterfaces overflow * USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID autofs: fix careless error in recent commit crypto: salsa20 - fix blkcipher_walk API usage * crypto: hmac - require that the underlying hash algorithm is unkeyed Linux 3.18.88 * usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one * audit: ensure that 'audit=1' actually enables audit for PID 1 afs: Connect up the CB.ProbeUuid IB/mlx5: Assign send CQ and recv CQ of UMR QP IB/mlx4: Increase maximal message size under UD QP * xfrm: Copy policy family in clone_policy atm: horizon: Fix irq release error sctp: use the right sk after waking up from wait_buf sleep sctp: do not free asoc when it is already dead in sctp_sendmsg sparc64/mm: set fields in deferred pages sunrpc: Fix rpc_task_begin trace point NFS: Fix a typo in nfs_rename() dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0 * lib/genalloc.c: make the avail variable an atomic_long_t * route: update fnhe_expires for redirect when the fnhe exists * route: also update fnhe_genid when updating a route cache EDAC, i5000, i5400: Fix definition of NRECMEMB register EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro axonram: Fix gendisk handling i2c: riic: fix restart condition crypto: s5p-sss - Fix completing crypto request in IRQ handler * ipv6: reorder icmpv6_init() and ip6_mr_init() bnx2x: fix possible overrun of VFPF multicast addresses array spi_ks8995: fix "BUG: key accdaa28 not in .data!" arm: KVM: Survive unknown traps from guests KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset irqchip/crossbar: Fix incorrect type of register size scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters * workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq libata: drop WARN from protocol error in ata_sff_qc_issue() USB: gadgetfs: Fix a potential memory leak in 'dev_config()' usb: gadget: configs: plug memory leak selftest/powerpc: Fix false failures for skipped tests Revert "s390/kbuild: enable modversions for symbols exported from asm" * Revert "drm/armada: Fix compile fail" * net/packet: fix a race in packet_bind() and packet_notifier() * sit: update frag_off info rds: Fix NULL pointer dereference in __rds_rdma_map * arm64: fpsimd: Prevent registers leaking from dead tasks KVM: VMX: remove I/O port 0x80 bypass on Intel hosts * arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one media: dvb: i2c transfers over usb cannot be done from stack kdb: Fix handling of kallsyms_symbol_next() return value iommu/vt-d: Fix scatterlist offset handling * ALSA: usb-audio: Add check return value for usb_string() * ALSA: usb-audio: Fix out-of-bound error ALSA: seq: Remove spurious WARN_ON() at timer check * ALSA: pcm: prevent UAF in snd_pcm_info x86/PCI: Make broadcom_postcore_init() check acpi_disabled * X.509: reject invalid BIT STRING for subjectPublicKey * KEYS: add missing permission check for request_key() destination * ASN.1: check for error from ASN1_OP_END__ACT actions * efi: Move some sysfs files to be read-only by root isa: Prevent NULL dereference in isa_bus driver callbacks hv: kvp: Avoid reading past allocated blocks from KVP file virtio: release virtio index when fail to device_register can: usb_8dev: cancel urb on -EPIPE and -EPROTO can: esd_usb2: cancel urb on -EPIPE and -EPROTO can: ems_usb: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: ratelimit errors if incomplete messages are received can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback() can: kvaser_usb: free buf in error paths Linux 3.18.87 usb: host: fix incorrect updating of offset * USB: usbfs: Filter flags passed in from user space * USB: devio: Prevent integer overflow in proc_do_submiturb() * USB: Increase usbfs transfer limit * usb: hub: Cycle HUB power when initialization fails serial: 8250_pci: Add Amazon PCI serial device ID * usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices ima: fix hash algorithm initialization net: fec: fix multicast filtering hardware setup * mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers tipc: fix cleanup at module unload net: sctp: fix array overrun read on sctp_timer_tbl NFSv4: Fix client recovery when server reboots multiple times net/appletalk: Fix kernel memory disclosure * vti6: fix device register to report IFLA_INFO_KIND ARM: OMAP1: DMA: Correct the number of logical channels perf test attr: Fix ignored test case result * sysrq : fix Show Regs call trace on ARM EDAC, sb_edac: Fix missing break in switch spi: sh-msiof: Fix DMA transfer size check serial: 8250_fintek: Fix rs485 disablement on invalid ioctl() bcache: recover data from backing when data is clean bcache: only permit to recovery read error when cache device is clean Linux 3.18.86 drm/i915: Prevent zero length "index" write drm/i915: Don't try indexed reads to alternate slave addresses NFS: revalidate "." etc correctly on "open". drm/panel: simple: Add missing panel_simple_unprepare() calls eeprom: at24: check at24_read/write arguments KVM: x86: inject exceptions produced by x86_decode_insn KVM: x86: Exit to user-mode on #UD intercept when emulator requires btrfs: clear space cache inode generation always * mm/madvise.c: fix madvise() infinite loop under special circumstances mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() * ipsec: Fix aborted xfrm policy dump crash * netlink: add a start callback for starting a netlink dump Linux 3.18.85 xen: xenbus driver must not accept invalid transaction ids s390/kbuild: enable modversions for symbols exported from asm ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data btrfs: return the actual error value from from btrfs_uuid_tree_iterate netfilter: nf_tables: fix oob access netfilter: nft_queue: use raw_smp_processor_id() staging: iio: cdc: fix improper return value mac80211: Suppress NEW_PEER_CANDIDATE event if no room mac80211: Remove invalid flag operations in mesh TSF synchronization ALSA: hda - Apply ALC269_FIXUP_NO_SHUTUP on HDA_FIXUP_ACT_PROBE * drm/armada: Fix compile fail net: 3com: typhoon: typhoon_init_one: fix incorrect return values net: 3com: typhoon: typhoon_init_one: make return values more specific * PCI: Apply _HPX settings only to relevant devices RDS: RDMA: return appropriate error on rdma map failures e1000e: Separate signaling for link check/link up e1000e: Fix return value test e1000e: Fix error path in link detection iio: iio-trig-periodic-rtc: Free trigger resource correctly * USB: fix buffer overflows with parsing CDC headers mtd: nand: Fix writing mtdoops to nand flash. net/9p: Switch to wait_event_killable() * media: v4l2-ctrl: Fix flags field on Control events media: rc: check for integer overflow media: Don't do DMA on stack for firmware upload in the AS102 driver powerpc/signal: Properly handle return value from uprobe_deny_signal() parisc: Fix validity check of pointer size argument in new CAS implementation ixgbe: Fix skb list corruption on Power systems fm10k: Use smp_rmb rather than read_barrier_depends i40evf: Use smp_rmb rather than read_barrier_depends ixgbevf: Use smp_rmb rather than read_barrier_depends igbvf: Use smp_rmb rather than read_barrier_depends igb: Use smp_rmb rather than read_barrier_depends i40e: Use smp_rmb rather than read_barrier_depends * time: Always make sure wall_to_monotonic isn't positive NFC: fix device-allocation error return IB/srpt: Do not accept invalid initiator port names clk: ti: dra7-atl-clock: fix child-node lookups clk: ti: dra7-atl-clock: Fix of_node reference counting KVM: SVM: obey guest PAT KVM: nVMX: set IDTR and GDTR limits when loading L1 host state iscsi-target: Fix non-immediate TMR reference leak fs/9p: Compare qid.path in v9fs_test_inode * ALSA: timer: Remove kernel warning at compat ioctl error paths * ALSA: usb-audio: Add sanity checks in v2 clock parsers * ALSA: usb-audio: Fix potential out-of-bound access at parsing SU * ALSA: usb-audio: Add sanity checks to FE parser * ext4: fix interaction between i_size, fallocate, and delalloc after a crash nfsd: deal with revoked delegations appropriately nfs: Fix ugly referral attributes NFS: Fix typo in nomigration mount option isofs: fix timestamps beyond 2027 bcache: check ca->alloc_thread initialized before wake up it eCryptfs: use after free in ecryptfs_release_messaging() nilfs2: fix race condition that causes file system corruption autofs: don't fail mount for transient error MIPS: BCM47XX: Fix LED inversion for WRT54GSv1 MIPS: Fix an n32 core file generation regset support regression * dm: fix race between dm_get_from_kobject() and __dm_destroy() * dm bufio: fix integer overflow when limiting maximum cache size ALSA: hda: Add Raven PCI ID ARM: 8721/1: mm: dump: check hardware RO bit for LPAE x86/decoder: Add new TEST instruction pattern * lib/mpi: call cond_resched() from mpi_powm() loop * sched: Make resched_cpu() unconditional * ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER s390/disassembler: increase show_code buffer size Linux 3.18.84 coda: fix 'kernel memory exposure attempt' in fsync ipmi: fix unsigned long underflow ocfs2: should wait dio before inode lock in ocfs2_setattr() ima: do not update security.ima if appraisal status is not INTEGRITY_PASS vlan: fix a use-after-free in vlan_device_event() * af_netlink: ensure that NLMSG_DONE never fails in dumps fealnx: Fix building error on MIPS sctp: do not peel off an assoc from one netns to another one * netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed * tcp: do not mangle skb->cb[] in tcp_make_synack() net/sctp: Always set scope_id in sctp_inet6_skb_msgname * ipv6/dccp: do not inherit ipv6_mc_list from parent Linux 3.18.83 USB: serial: garmin_gps: fix memory leak on probe errors USB: serial: garmin_gps: fix I/O after failed probe and remove USB: serial: garmin_gps: fix memory leak on failed URB submit USB: serial: qcserial: add pid/vid for Sierra Wireless EM7355 fw update * USB: Add delay-init quirk for Corsair K70 LUX keyboards * USB: usbfs: compute urb->actual_length for isochronous uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/rds.h userspace compilation error Revert "uapi: fix linux/rds.h userspace compilation errors" * Revert "crypto: xts - Add ECB dependency" MIPS: Netlogic: Exclude netlogic,xlp-pic code from XLR builds MIPS: init: Ensure reserved memory regions are not added to bootmem MIPS: End asm function prologue macros with .insn ixgbe: handle close/suspend race with netif_device_detach/present ixgbe: fix AER error handling gpu: drm: mgag200: mgag200_main:- Handle error from pci_iomap backlight: adp5520: Fix error handling in adp5520_bl_probe() * backlight: lcd: Fix race condition during register ALSA: vx: Fix possible transfer overflow ALSA: vx: Don't try to update capture stream before running scsi: lpfc: Correct issue leading to oops during link reset scsi: lpfc: Correct host name in symbolic_name field scsi: lpfc: FCoE VPort enable-disable does not bring up the VPort scsi: lpfc: Add missing memory barrier staging: rtl8188eu: fix incorrect ERROR tags from logs igb: Fix hw_dbg logging in igb_update_flash_i210 igb: close/suspend race in netif_device_detach igb: reset the PHY before reading the PHY ID drm/sti: sti_vtg: Handle return NULL error from devm_ioremap_nocache * ata: SATA_MV should depend on HAS_DMA * ata: SATA_HIGHBANK should depend on HAS_DMA * ata: ATA_BMDMA should depend on HAS_DMA ARM: dts: Fix omap3 off mode pull defines ARM: OMAP2+: Fix init for multiple quirks for the same SoC extcon: palmas: Check the parent instance to prevent the NULL iscsi-target: Fix iscsi_np reset hung task during parallel delete media: dib0700: fix invalid dvb_detach argument media: imon: Fix null-ptr-deref in imon_probe Linux 3.18.82 target/iscsi: Fix iSCSI task reassignment handling * security/keys: add CONFIG_KEYS_COMPAT to Kconfig ip6_gre: only increase err_count for some certain type icmpv6 in ip6gre_err ipip: only increase err_count for some certain type icmp in ipip_err * ipv6: flowlabel: do not leave opt->tot_len with garbage sctp: reset owner sk for data chunks on out queues when migrating a sock * tun: allow positive return values on dev_get_valid_name() call net/unix: don't show information about sockets from other namespaces sctp: add the missing sock_owned_by_user check in sctp_icmp_redirect * tun: call dev_get_valid_name() before register_netdevice() * l2tp: check ps->sock before running pppol2tp_session_ioctl() * tcp: fix tcp_mtu_probe() vs highest_sack * tun/tap: sanitize TUNSETSNDBUF input Revert "ARM: dts: imx53-qsb-common: fix FEC pinmux config" Input: ims-psu - check if CDC union descriptor is sane usb: usbtest: fix NULL pointer dereference mac80211: don't compare TKIP TX MIC key in reinstall prevention mac80211: use constant time comparison with keys mac80211: accept key reinstall without changing anything Revert "ceph: unlock dangling spinlock in try_flush_caps()" Linux 3.18.81 x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context can: c_can: don't indicate triple sampling support for D_CAN rbd: use GFP_NOIO for parent stat and data requests MIPS: AR7: Ensure that serial ports are properly set up MIPS: Fix CM region target definitions MIPS: microMIPS: Fix incorrect mask in insn_table_MM ALSA: seq: Avoid invalid lockdep class warning ALSA: seq: Fix OSS sysex delivery in OSS emulation ARM: 8720/1: ensure dump_instr() checks addr_limit * KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2] crypto: x86/sha1-mb - fix panic due to unaligned access KEYS: trusted: fix writing past end of buffer in trusted_read() KEYS: trusted: sanitize all key material IB/ipoib: Change list_del to list_del_init in the tx object Input: mpr121 - set missing event capability Input: mpr121 - handle multiple bits change of status register * IPsec: do not ignore crypto err in ah4 input * usb: hcd: initialize hcd->flags to 0 when rm hcd serial: sh-sci: Fix register offsets for the IRDA serial port * phy: increase size of MII_BUS_ID_SIZE and bus_id dt-bindings: Add vendor prefix for LEGO dt-bindings: Add LEGO MINDSTORMS EV3 compatible specification iio: trigger: free trigger resource correctly ARM: omap2plus_defconfig: Fix probe errors on UARTs 5 and 6 drm: drm_minor_register(): Clean up debugfs on failure ARM: dts: imx53-qsb-common: fix FEC pinmux config xen/netback: set default upper limit of tx/rx queues to 8 video: fbdev: pmag-ba-fb: Remove bad `__init' annotation Linux 3.18.80 staging: r8712u: Fix Sparse warning in rtl871x_xmit.c xen: don't print error message in case of missing Xenstore entry bt8xx: fix memory leak s390/dasd: check for device error pointer within state change interrupts staging: lustre: ptlrpc: skip lock if export failed staging: lustre: hsm: stack overrun in hai_dump_data_field platform/x86: intel_mid_thermal: Fix module autoload xen/manage: correct return value check on xenbus_scanf() cx231xx: Fix I2C on Internal Master 3 Bus i2c: riic: correctly finish transfers * ext4: do not use stripe_width if it is not set * ext4: fix stripe-unaligned allocations staging: rtl8712u: Fix endian settings for structs describing network packets mmc: s3cmci: include linux/interrupt.h for tasklet_struct x86/microcode/intel: Disable late loading on model 79 drm/msm: fix an integer overflow test drm/msm: Fix potential buffer overflow issue ocfs2: fstrim: Fix start offset of first cluster group during fstrim ARM: 8715/1: add a private asm/unaligned.h * arm64: ensure __dump_instr() checks addr_limit ASoC: adau17x1: Workaround for noise bug in ADC * KEYS: fix out-of-bounds read during ASN.1 parsing * KEYS: return full count in keyring_read() if buffer is too small cifs: check MaxPathNameComponentLength != 0 before using it ALSA: seq: Fix nested rwsem annotation for lockdep splat * ALSA: timer: Add missing mutex lock for compat ioctls * blk-mq: fix race between timeout and freeing request Linux 3.18.79 * ecryptfs: fix dereference of NULL user_key_payload can: kvaser_usb: Correct return value in printout * scsi: sg: Re-fix off by one in sg_fill_request_table() scsi: zfcp: fix erp_action use-before-initialize in REC action trace * assoc_array: Fix a buggy node-splitting case Input: gtco - fix potential out-of-bound access * fuse: fix READDIRPLUS skipping an entry * spi: uapi: spidev: add missing ioctl header * usb: xhci: Handle error condition in xhci_stop_device() ceph: unlock dangling spinlock in try_flush_caps() Linux 3.18.78 FS-Cache: fix dereference of NULL user_key_payload * af_packet: don't pass empty blocks for PACKET_V3 parisc: Fix double-word compare and exchange in LWS code on 32-bit kernels parisc: Avoid trashing sr2 and sr3 in LWS code * cls_api.c: Fix dumping of non-existing actions' stats. * KEYS: don't let add_key() update an uninstantiated key lib/digsig: fix dereference of NULL user_key_payload * KEYS: encrypted: fix dereference of NULL user_key_payload bus: mbus: fix window size calculation for 4GB windows brcmsmac: make some local variables 'static const' to reduce stack size i2c: ismt: Separate I2C block read from SMBus block read ALSA: hda: Remove superfluous '-' added by printk conversion ALSA: seq: Enable 'use' locking in all configurations can: esd_usb2: Fix can_dlc value for received RTR, frames can: gs_usb: fix busy loop if no more TX context is available * usb: hub: Allow reset retry for USB2 devices on connect bounce * usb: quirks: add quirk for WORLDE MINI MIDI keyboard usb: cdc_acm: Add quirk for Elatec TWN3 USB: serial: metro-usb: add MS7820 device id * USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor() * USB: devio: Revert "USB: devio: Don't corrupt user memory" Linux 3.18.77 Revert "tty: goldfish: Fix a parameter of a call to free_irq" target/iscsi: Fix unsolicited data seq_end_offset calculation * uapi: fix linux/mroute6.h userspace compilation errors uapi: fix linux/rds.h userspace compilation errors scsi: scsi_dh_emc: return success in clariion_std_inquiry() ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock * crypto: xts - Add ECB dependency net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs Btrfs: send, fix failure to rename top level inode due to name collision iio: adc: xilinx: Fix error handling * netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. irqchip/crossbar: Fix incorrect type of local variables watchdog: kempld: fix gcc-4.3 build locking/lockdep: Add nest_lock integrity test Revert "bsg-lib: don't free job in bsg_prepare_job" * net: Set sk_prot_creator when cloning sockets to the right proto * packet: in packet_do_bind, test fanout with bind_lock held * l2tp: fix race condition in l2tp_tunnel_delete * l2tp: Avoid schedule while atomic in exit_net * vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit isdn/i4l: fetch the ppp_write buffer in one shot * packet: hold bind lock when rebinding to fanout hook bpf/verifier: reject BPF_ALU64|BPF_END * sctp: potential read out of bounds in sctp_ulpevent_type_enabled() * ext4: avoid deadlock when expanding inode size drm/dp/mst: save vcpi with payloads x86/mm: Disable preemption during CR3 read+write Linux 3.18.76 Revert "usb: gadget: inode.c: fix unbalanced spin_lock in ep0_write" ALSA: seq: Fix missing NULL check at remove_events ioctl USB: serial: console: fix use-after-free after failed setup USB: serial: qcserial: add Dell DW5818, DW5819 USB: serial: option: add support for TP-Link LTE module USB: serial: cp210x: add support for ELV TFD500 * fix unbalanced page refcounting in bio_map_user_iov * direct-io: Prevent NULL pointer access in submit_page_section * usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options ALSA: caiaq: Fix stray URB at probe error path ALSA: seq: Fix copy_from_user() call inside lock ALSA: seq: Fix use-after-free at creating a port * ALSA: usb-audio: Kill stray URB at exiting iommu/amd: Finish TLB flush in amd_iommu_unmap() usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit * crypto: shash - Fix zero-length shash ahash digest crash * HID: usbhid: fix out-of-bounds bug CIFS: Reconnect expired SMB sessions * ext4: in ext4_seek_{hole,data}, return -ENXIO for negative offsets Linux 3.18.75 * ext4: fix fencepost in s_first_meta_bg validation * ext4: validate s_first_meta_bg at mount time ext4: Don't clear SGID when inheriting ACLs * ext4: fix data corruption for mmap writes * fs/super.c: fix race between freeze_super() and thaw_super() * ext4: only call ext4_truncate when size <= isize drm/i915/bios: ignore HDMI on port A HID: i2c-hid: allocate hid buffers for real worst case * driver core: platform: Don't read past the end of "driver_override" buffer ALSA: usx2y: Suppress kernel warning at page allocation failures * lsm: fix smack_inode_removexattr and xattr_getsecurity memleak uwb: ensure that endpoint is interrupt uwb: properly check kthread_run return value iio: adc: mcp320x: Fix oops on module unload iio: ad7793: Fix the serial interface reset * iio: core: Return error for failed read_reg staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack. iio: ad_sigma_delta: Implement a dedicated reset function * xhci: fix finding correct bus_state structure for USB 3.1 hosts * USB: fix out-of-bounds in usb_set_configuration * usb: Increase quirk delay for USB devices USB: uas: fix bug in handling of alternate settings * USB: devio: Don't corrupt user memory USB: dummy-hcd: fix infinite-loop resubmission bug USB: dummy-hcd: fix connection failures (wrong speed) * usb: pci-quirks.c: Corrected timeout values used in handshake * ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe * usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives USB: gadgetfs: fix copy_to_user while holding spinlock USB: gadgetfs: Fix crash caused by inadequate synchronization usb: gadget: inode.c: fix unbalanced spin_lock in ep0_write Linux 3.18.74 * mpi: Fix NULL ptr dereference in mpi_powm() [ver #3] crypto: algif_skcipher - Load TX SG list after waiting staging: nvec: remove duplicated const ttpci: address stringop overflow warning ALSA: au88x0: avoid theoretical uninitialized access IB/qib: fix false-postive maybe-uninitialized warning libata: transport: Remove circular dependency at free time xfs: remove kmem_zalloc_greedy md/raid10: submit bio directly to replacement disk rds: ib: add error handle parisc: perf: Fix potential NULL pointer dereference netfilter: nfnl_cthelper: fix incorrect helper->expect_class_max exynos-gsc: Do not swap cb/cr for semi planar formats * netfilter: invoke synchronize_rcu after set the _hook_ to NULL * mmc: sdio: fix alignment issue in struct sdio_func * usb: plusb: Add support for PL-27A1 team: fix memory leaks * net/packet: check length in getsockopt() called with PACKET_HDRLEN * net: core: Prevent from dereferencing null pointer when releasing SKB * audit: log 32-bit socketcalls * partitions/efi: Fix integer overflow in GPT size calculation USB: serial: mos7840: fix control-message error handling USB: serial: mos7720: fix control-message error handling IB/ipoib: Replace list_del of the neigh->list with list_del_init IB/ipoib: rtnl_unlock can not come after free_netdev IB/ipoib: Fix deadlock over vlan_mutex tty: goldfish: Fix a parameter of a call to free_irq ARM: 8635/1: nommu: allow enabling REMAP_VECTORS_TO_RAM hwmon: (gl520sm) Fix overflows and crash seen when writing into limit attributes sh_eth: use correct name for ECMR_MPDE bit MIPS: Ensure bss section ends on a long-aligned address RDS: RDMA: Fix the composite message user notification drm: bridge: add DT bindings for TI ths8135 Linux 3.18.73 fix xen_swiotlb_dma_mmap prototype swiotlb-xen: implement xen_swiotlb_dma_mmap callback video: fbdev: aty: do not leak uninitialized padding in clk to userspace x86/fpu: Don't let userspace set bogus xcomp_bv btrfs: prevent to set invalid default subvolid * PCI: Fix race condition with driver_override kvm: nVMX: Don't allow L2 to access the hardware CR8 * arm64: Make sure SPsel is always set bsg-lib: don't free job in bsg_prepare_job * nl80211: check for the required netlink attributes presence * vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags SMB: Validate negotiate (to protect against downgrade) even if signing off powerpc/pseries: Fix parent_dn reference leak in add_dt_node() * KEYS: prevent KEYCTL_READ on negative key * KEYS: prevent creating a different user's keyrings * KEYS: fix writing past end of user-supplied buffer in keyring_read() crypto: talitos - fix sha224 scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly * tracing: Erase irqsoff trace with empty write * tracing: Fix trace_pipe behavior for instance traces KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() mac80211: flush hw_roc_start work before cancelling the ROC cifs: release auth_key.response for reconnect. cifs: release cifs root_cred after exit_cifs Linux 3.18.72 bcache: fix bch_hprint crash and improve output bcache: fix for gc and write-back race bcache: Correct return value for sysfs attach errors bcache: correct cache_dirty_target in __update_writeback_rate() bcache: Fix leak of bdev reference bcache: initialize dirty stripes in flash_dev_run() media: uvcvideo: Prevent heap overflow when accessing mapped controls * media: v4l2-compat-ioctl32: Fix timespec conversion PCI: shpchp: Enable bridge bus mastering if MSI is enabled ARC: Re-enable MMU upon Machine Check exception * tracing: Apply trace_clock changes to instance max buffer ftrace: Fix selftest goto location on error scsi: qla2xxx: Fix an integer overflow in sysfs code * scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE * scsi: sg: factor out sg_fill_request_table() * scsi: sg: off by one in sg_ioctl() * scsi: sg: use standard lists for sg_requests * scsi: sg: remove 'save_scat_len' scsi: zfcp: trace high part of "new" 64 bit SCSI LUN scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records scsi: zfcp: fix missing trace records for early returns in TMF eh handlers scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled skd: Submit requests to firmware before triggering the doorbell skd: Avoid that module unloading triggers a use-after-free md/bitmap: disable bitmap_resize for file-backed bitmaps. * block: Relax a check in blk_start_queue() powerpc: Fix DAR reporting when alignment handler faults * ext4: fix incorrect quotaoff if the quota feature is enabled crypto: AF_ALG - remove SGL terminator indicator when chaining Input: i8042 - add Gigabyte P57 to the keyboard reset table ip6_gre: fix endianness errors in ip6gre_err Revert "usb: musb: fix tx fifo flush handling again" f2fs: check hot_data for roll-forward recovery * ipv6: fix typo in fib6_net_exit() * ipv6: fix memory leak with multiple tables during netns destruction * tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0 * Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()" qlge: avoid memcpy buffer overflow * ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt() Linux 3.18.71 xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present ARM: 8692/1: mm: abort uaccess retries upon fatal signal * Bluetooth: Properly check L2CAP config option output buffer length ALSA: msnd: Optimize / harden DSP and MIDI loops locktorture: Fix potential memory leak with rw lock test btrfs: resume qgroup rescan on rw remount * scsi: sg: recheck MMAP_IO request length with lock held * scsi: sg: protect against races between mmap() and SG_SET_RESERVED_SIZE * cs5536: add support for IDE controller variant * workqueue: Fix flag collision * cma: fix calculation of aligned offset dlm: avoid double-free on error path in dlm_device_{register,unregister} Input: trackpoint - assume 3 buttons when buttons detection fails * driver core: bus: Fix a potential double free staging/rts5208: fix incorrect shift to extract upper nybble * USB: core: Avoid race of async_completed() w/ usbdev_release() * usb:xhci:Fix regression when ATI chipsets detected * usb: Add device quirk for Logitech HD Pro Webcam C920-C USB: serial: option: add support for D-Link DWM-157 C1 * usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard Conflicts: drivers/input/input.c drivers/media/v4l2-core/v4l2-compat-ioctl32.c drivers/scsi/sg.c drivers/usb/dwc3/gadget.c drivers/usb/gadget/function/f_fs.c drivers/usb/host/xhci-hub.c net/ipv4/raw.c net/packet/af_packet.c sound/usb/card.c sound/usb/mixer.c Change-Id: I4ca2d8f23d99e69b73d055262327f4c71da20a7c Signed-off-by: Thierry Strudel <tstrudel@google.com>
| * net: tcp: close sock if net namespace is exitingDan Streetman2018-01-311-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ] When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence. For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like: unregister_netdevice: waiting for lo to become free. Usage count = 1 These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting. After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811 Signed-off-by: Dan Streetman <ddstreet@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge linux-3.18.70 into android-msm-marlin-3.18Thierry Strudel2017-09-121-1/+2
|\| | | | | | | | | Change-Id: Ifbed5d4275df07fa37f66c873eab5740228e422a Signed-off-by: Thierry Strudel <tstrudel@google.com>
| * net: fix keepalive code vs TCP_FASTOPEN_CONNECTEric Dumazet2017-08-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2dda640040876cd8ae646408b69eea40c24f9ae9 ] syzkaller was able to trigger a divide by 0 in TCP stack [1] Issue here is that keepalive timer needs to be updated to not attempt to send a probe if the connection setup was deferred using TCP_FASTOPEN_CONNECT socket option added in linux-4.11 [1] divide error: 0000 [#1] SMP CPU: 18 PID: 0 Comm: swapper/18 Not tainted task: ffff986f62f4b040 ti: ffff986f62fa2000 task.ti: ffff986f62fa2000 RIP: 0010:[<ffffffff8409cc0d>] [<ffffffff8409cc0d>] __tcp_select_window+0x8d/0x160 Call Trace: <IRQ> [<ffffffff8409d951>] tcp_transmit_skb+0x11/0x20 [<ffffffff8409da21>] tcp_xmit_probe_skb+0xc1/0xe0 [<ffffffff840a0ee8>] tcp_write_wakeup+0x68/0x160 [<ffffffff840a151b>] tcp_keepalive_timer+0x17b/0x230 [<ffffffff83b3f799>] call_timer_fn+0x39/0xf0 [<ffffffff83b40797>] run_timer_softirq+0x1d7/0x280 [<ffffffff83a04ddb>] __do_softirq+0xcb/0x257 [<ffffffff83ae03ac>] irq_exit+0x9c/0xb0 [<ffffffff83a04c1a>] smp_apic_timer_interrupt+0x6a/0x80 [<ffffffff83a03eaf>] apic_timer_interrupt+0x7f/0x90 <EOI> [<ffffffff83fed2ea>] ? cpuidle_enter_state+0x13a/0x3b0 [<ffffffff83fed2cd>] ? cpuidle_enter_state+0x11d/0x3b0 Tested: Following packetdrill no longer crashes the kernel `echo 0 >/proc/sys/net/ipv4/tcp_timestamps` // Cache warmup: send a Fast Open cookie request 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 setsockopt(3, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation is now in progress) +0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop> +.01 < S. 123:123(0) ack 1 win 14600 <mss 1460,nop,nop,sackOK,nop,wscale 6,FO abcd1234,nop,nop> +0 > . 1:1(0) ack 1 +0 close(3) = 0 +0 > F. 1:1(0) ack 1 +0 < F. 1:1(0) ack 2 win 92 +0 > . 2:2(0) ack 2 +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4 +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0 +.01 connect(4, ..., ...) = 0 +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [5], 4) = 0 +10 close(4) = 0 `echo 1 >/proc/sys/net/ipv4/tcp_timestamps` Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Big merge into 3.18.52Greg Kroah-Hartman2017-05-121-2/+4
|\| | | | | | | | | | | | | This merges from 3.18.44 to 3.18.52 all in one big chunk to make things go faster on the review side. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
| * tcp: fix various issues for sockets morphing to listen stateEric Dumazet2017-04-181-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 02b2faaf0af1d85585f6d6980e286d53612acfc2 upstream. Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting tcp_disconnect() path that was never really considered and/or used before syzkaller ;) I was not able to reproduce the bug, but it seems issues here are the three possible actions that assumed they would never trigger on a listener. 1) tcp_write_timer_handler 2) tcp_delack_timer_handler 3) MTU reduction Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN states from tcp_v6_mtu_reduced() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | WLAN subsystem: Sysctl support for key TCP/IP parametersRavi Joshi2015-09-151-0/+34
|/ | | | | | | | | | | | | It has been observed that default values for some of key tcp/ip parameters are affecting the tput/performance of the system. Hence extending configuration capabilities to TCP/Ip stack through sysctl interface Change-Id: I4287e9103769535f43e0934bac08435a524ee6a4 CRs-Fixed: 507581 Signed-off-by: Ravi Joshi <ravij@codeaurora.org> Signed-off-by: Ganesh Babu Kumaravel <kganesh@codeaurora.org> Signed-off-by: Mohit Khanna <mkhannaqca@codeaurora.org>
* tcp: abort orphan sockets stalling on zero window probesYuchung Cheng2014-10-011-20/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we have two different policies for orphan sockets that repeatedly stall on zero window ACKs. If a socket gets a zero window ACK when it is transmitting data, the RTO is used to probe the window. The socket is aborted after roughly tcp_orphan_retries() retries (as in tcp_write_timeout()). But if the socket was idle when it received the zero window ACK, and later wants to send more data, we use the probe timer to probe the window. If the receiver always returns zero window ACKs, icsk_probes keeps getting reset in tcp_ack() and the orphan socket can stall forever until the system reaches the orphan limit (as commented in tcp_probe_timer()). This opens up a simple attack to create lots of hanging orphan sockets to burn the memory and the CPU, as demonstrated in the recent netdev post "TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised." http://www.spinics.net/lists/netdev/msg296539.html This patch follows the design in RTO-based probe: we abort an orphan socket stalling on zero window when the probe timer reaches both the maximum backoff and the maximum RTO. For example, an 100ms RTT connection will timeout after roughly 153 seconds (0.3 + 0.6 + .... + 76.8) if the receiver keeps the window shut. If the orphan socket passes this check, but the system already has too many orphans (as in tcp_out_of_resources()), we still abort it but we'll also send an RST packet as the connection may still be active. In addition, we change TCP_USER_TIMEOUT to cover (life or dead) sockets stalled on zero-window probes. This changes the semantics of TCP_USER_TIMEOUT slightly because it previously only applies when the socket has pending transmission. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: avoid possible arithmetic overflowsEric Dumazet2014-09-221-2/+2
| | | | | | | | | | | | | | icsk_rto is a 32bit field, and icsk_backoff can reach 15 by default, or more if some sysctl (eg tcp_retries2) are changed. Better use 64bit to perform icsk_rto << icsk_backoff operations As Joe Perches suggested, add a helper for this. Yuchung spotted the tcp_v4_err() case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: remove TCP_SKB_CB(skb)->whenEric Dumazet2014-09-051-4/+3
| | | | | | | | | | | After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"), we no longer need to maintain timestamps in two different fields. TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: reduce spurious retransmits due to transient SACK renegingNeal Cardwell2014-08-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit reduces spurious retransmits due to apparent SACK reneging by only reacting to SACK reneging that persists for a short delay. When a sequence space hole at snd_una is filled, some TCP receivers send a series of ACKs as they apparently scan their out-of-order queue and cumulatively ACK all the packets that have now been consecutiveyly received. This is essentially misbehavior B in "Misbehaviors in TCP SACK generation" ACM SIGCOMM Computer Communication Review, April 2011, so we suspect that this is from several common OSes (Windows 2000, Windows Server 2003, Windows XP). However, this issue has also been seen in other cases, e.g. the netdev thread "TCP being hoodwinked into spurious retransmissions by lack of timestamps?" from March 2014, where the receiver was thought to be a BSD box. Since snd_una would temporarily be adjacent to a previously SACKed range in these scenarios, this receiver behavior triggered the Linux SACK reneging code path in the sender. This led the sender to clear the SACK scoreboard, enter CA_Loss, and spuriously retransmit (potentially) every packet from the entire write queue at line rate just a few milliseconds before the ACK for each packet arrives at the sender. To avoid such situations, now when a sender sees apparent reneging it does not yet retransmit, but rather adjusts the RTO timer to give the receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs that will restore sanity to the SACK scoreboard. If the reneging persists until this RTO then, as before, we clear the SACK scoreboard and enter CA_Loss. A 10ms delay tolerates a receiver sending such a stream of ACKs at 56Kbit/sec. And to allow for receivers with slower or more congested paths, we wait for at least RTT/2. We validated the resulting max(RTT/2, 10ms) delay formula with a mix of North American and South American Google web server traffic, and found that for ACKs displaying transient reneging: (1) 90% of inter-ACK delays were less than 10ms (2) 99% of inter-ACK delays were less than RTT/2 In tests on Google web servers this commit reduced reneging events by 75%-90% (as measured by the TcpExtTCPSACKReneging counter), without any measurable impact on latency for user HTTP and SPDY requests. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: snmp stats for Fast Open, SYN rtx, and data pktsYuchung Cheng2014-03-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Add the following snmp stats: TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed beacuse the remote does not accept it or the attempts timed out. TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down retransmissions into SYN, fast-retransmits, timeout retransmits, etc. TCPOrigDataSent: number of outgoing packets with original data (excluding retransmission but including data-in-SYN). This counter is different from TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is more useful to track the TCP retransmission rate. Change TCPFastOpenActive to track only successful Fast Opens to be symmetric to TCPFastOpenPassive. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: Lawrence Brakmo <brakmo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: temporarily disable Fast Open on SYN timeoutYuchung Cheng2013-10-291-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Fast Open currently has a fall back feature to address SYN-data being dropped but it requires the middle-box to pass on regular SYN retry after SYN-data. This is implemented in commit aab487435 ("net-tcp: Fast Open client - detecting SYN-data drops") However some NAT boxes will drop all subsequent packets after first SYN-data and blackholes the entire connections. An example is in commit 356d7d8 "netfilter: nf_conntrack: fix tcp_in_window for Fast Open". The sender should note such incidents and fall back to use the regular TCP handshake on subsequent attempts temporarily as well: after the second SYN timeouts the original Fast Open SYN is most likely lost. When such an event recurs Fast Open is disabled based on the number of recurrences exponentially. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: make lookups simpler and fasterEric Dumazet2013-10-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: refactor F-RTOYuchung Cheng2013-03-211-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch series refactor the F-RTO feature (RFC4138/5682). This is to simplify the loss recovery processing. Existing F-RTO was developed during the experimental stage (RFC4138) and has many experimental features. It takes a separate code path from the traditional timeout processing by overloading CA_Disorder instead of using CA_Loss state. This complicates CA_Disorder state handling because it's also used for handling dubious ACKs and undos. While the algorithm in the RFC does not change the congestion control, the implementation intercepts congestion control in various places (e.g., frto_cwnd in tcp_ack()). The new code implements newer F-RTO RFC5682 using CA_Loss processing path. F-RTO becomes a small extension in the timeout processing and interfaces with congestion control and Eifel undo modules. It lets congestion control (module) determines how many to send independently. F-RTO only chooses what to send in order to detect spurious retranmission. If timeout is found spurious it invokes existing Eifel undo algorithms like DSACK or TCP timestamp based detection. The first patch removes all F-RTO code except the sysctl_tcp_frto is left for the new implementation. Since CA_EVENT_FRTO is removed, TCP westwood now computes ssthresh on regular timeout CA_EVENT_LOSS event. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: TLP loss detection.Nandita Dukkipati2013-03-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | This is the second of the TLP patch series; it augments the basic TLP algorithm with a loss detection scheme. This patch implements a mechanism for loss detection when a Tail loss probe retransmission plugs a hole thereby masking packet loss from the sender. The loss detection algorithm relies on counting TLP dupacks as outlined in Sec. 3 of: http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01 The basic idea is: Sender keeps track of TLP "episode" upon retransmission of a TLP packet. An episode ends when the sender receives an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the episode. We want to make sure that before the episode ends the sender receives a "TLP dupack", indicating that the TLP retransmission was unnecessary, so there was no loss/hole that needed plugging. If the sender gets no TLP dupack before the end of the episode, then it reduces ssthresh and the congestion window, because the TLP packet arriving at the receiver probably plugged a hole. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Tail loss probe (TLP)Nandita Dukkipati2013-03-121-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch series implement the Tail loss probe (TLP) algorithm described in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The first patch implements the basic algorithm. TLP's goal is to reduce tail latency of short transactions. It achieves this by converting retransmission timeouts (RTOs) occuring due to tail losses (losses at end of transactions) into fast recovery. TLP transmits one packet in two round-trips when a connection is in Open state and isn't receiving any ACKs. The transmitted packet, aka loss probe, can be either new or a retransmission. When there is tail loss, the ACK from a loss probe triggers FACK/early-retransmit based fast recovery, thus avoiding a costly RTO. In the absence of loss, there is no change in the connection state. PTO stands for probe timeout. It is a timer event indicating that an ACK is overdue and triggers a loss probe packet. The PTO value is set to max(2*SRTT, 10ms) and is adjusted to account for delayed ACK timer when there is only one oustanding packet. TLP Algorithm On transmission of new data in Open state: -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms). -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms) -> PTO = min(PTO, RTO) Conditions for scheduling PTO: -> Connection is in Open state. -> Connection is either cwnd limited or no new data to send. -> Number of probes per tail loss episode is limited to one. -> Connection is SACK enabled. When PTO fires: new_segment_exists: -> transmit new segment. -> packets_out++. cwnd remains same. no_new_packet: -> retransmit the last segment. Its ACK triggers FACK or early retransmit based recovery. ACK path: -> rearm RTO at start of ACK processing. -> reschedule PTO if need be. In addition, the patch includes a small variation to the Early Retransmit (ER) algorithm, such that ER and TLP together can in principle recover any N-degree of tail loss through fast recovery. TLP is controlled by the same sysctl as ER, tcp_early_retrans sysctl. tcp_early_retrans==0; disables TLP and ER. ==1; enables RFC5827 ER. ==2; delayed ER. ==3; TLP and delayed ER. [DEFAULT] ==4; TLP only. The TLP patch series have been extensively tested on Google Web servers. It is most effective for short Web trasactions, where it reduced RTOs by 15% and improved HTTP response time (average by 6%, 99th percentile by 10%). The transmitted probes account for <0.5% of the overall transmissions. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-11-101-2/+2
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c Minor conflict between the BCM_CNIC define removal in net-next and a bug fix added to net. Based upon a conflict resolution patch posted by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: Reject invalid ack_seq to Fast Open socketsJerry Chu2012-10-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A packet with an invalid ack_seq may cause a TCP Fast Open socket to switch to the unexpected TCP_CLOSING state, triggering a BUG_ON kernel panic. When a FIN packet with an invalid ack_seq# arrives at a socket in the TCP_FIN_WAIT1 state, rather than discarding the packet, the current code will accept the FIN, causing state transition to TCP_CLOSING. This may be a small deviation from RFC793, which seems to say that the packet should be dropped. Unfortunately I did not expect this case for Fast Open hence it will trigger a BUG_ON panic. It turns out there is really nothing bad about a TFO socket going into TCP_CLOSING state so I could just remove the BUG_ON statements. But after some thought I think it's better to treat this case like TCP_SYN_RECV and return a RST to the confused peer who caused the unacceptable ack_seq to be generated in the first place. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: better retrans tracking for defer-acceptEric Dumazet2012-11-031-4/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For passive TCP connections using TCP_DEFER_ACCEPT facility, we incorrectly increment req->retrans each time timeout triggers while no SYNACK is sent. SYNACK are not sent for TCP_DEFER_ACCEPT that were established (for which we received the ACK from client). Only the last SYNACK is sent so that we can receive again an ACK from client, to move the req into accept queue. We plan to change this later to avoid the useless retransmit (and potential problem as this SYNACK could be lost) TCP_INFO later gives wrong information to user, claiming imaginary retransmits. Decouple req->retrans field into two independent fields : num_retrans : number of retransmit num_timeout : number of timeouts num_timeout is the counter that is incremented at each timeout, regardless of actual SYNACK being sent or not, and used to compute the exponential timeout. Introduce inet_rtx_syn_ack() helper to increment num_retrans only if ->rtx_syn_ack() succeeded. Use inet_rtx_syn_ack() from tcp_check_req() to increment num_retrans when we re-send a SYNACK in answer to a (retransmitted) SYN. Prior to this patch, we were not counting these retransmits. Change tcp_v[46]_rtx_synack() to increment TCP_MIB_RETRANSSEGS only if a synack packet was successfully queued. Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: Vijay Subramanian <subramanian.vijay@gmail.com> Cc: Elliott Hughes <enh@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: TCP Fast Open Server - support TFO listenersJerry Chu2012-08-311-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch builds on top of the previous patch to add the support for TFO listeners. This includes - 1. allocating, properly initializing, and managing the per listener fastopen_queue structure when TFO is enabled 2. changes to the inet_csk_accept code to support TFO. E.g., the request_sock can no longer be freed upon accept(), not until 3WHS finishes 3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg() if it's a TFO socket 4. properly closing a TFO listener, and a TFO socket before 3WHS finishes 5. supporting TCP_FASTOPEN socket option 6. modifying tcp_check_req() to use to check a TFO socket as well as request_sock 7. supporting TCP's TFO cookie option 8. adding a new SYN-ACK retransmit handler to use the timer directly off the TFO socket rather than the listener socket. Note that TFO server side will not retransmit anything other than SYN-ACK until the 3WHS is completed. The patch also contains an important function "reqsk_fastopen_remove()" to manage the somewhat complex relation between a listener, its request_sock, and the corresponding child socket. See the comment above the function for the detail. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: fix possible socket refcount problemEric Dumazet2012-08-211-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 6f458dfb40 (tcp: improve latencies of timer triggered events) added bug leading to following trace : [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.131726] [ 2866.132188] ========================= [ 2866.132281] [ BUG: held lock freed! ] [ 2866.132281] 3.6.0-rc1+ #622 Not tainted [ 2866.132281] ------------------------- [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there! [ 2866.132281] (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] 4 locks held by kworker/0:1/652: [ 2866.132281] #0: (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #1: ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #2: (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] #3: (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f [ 2866.132281] [ 2866.132281] stack backtrace: [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622 [ 2866.132281] Call Trace: [ 2866.132281] <IRQ> [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159 [ 2866.132281] [<ffffffff818a0839>] ? __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a [ 2866.132281] [<ffffffff818a0839>] __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff818a08c0>] sk_free+0x1c/0x1e [ 2866.132281] [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56 [ 2866.132281] [<ffffffff81078082>] run_timer_softirq+0x218/0x35f [ 2866.132281] [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f [ 2866.132281] [<ffffffff810f5831>] ? rb_commit+0x58/0x85 [ 2866.132281] [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148 [ 2866.132281] [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9 [ 2866.132281] [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e [ 2866.132281] [<ffffffff81a1227c>] call_softirq+0x1c/0x30 [ 2866.132281] [<ffffffff81039f38>] do_softirq+0x4a/0xa6 [ 2866.132281] [<ffffffff81070f2b>] irq_exit+0x51/0xad [ 2866.132281] [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4 [ 2866.132281] [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f [ 2866.132281] <EOI> [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1 [ 2866.132281] [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56 [ 2866.132281] [<ffffffff81078692>] mod_timer+0x178/0x1a9 [ 2866.132281] [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26 [ 2866.132281] [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4 [ 2866.132281] [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70 [ 2866.132281] [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4 [ 2866.132281] [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1 [ 2866.132281] [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a [ 2866.132281] [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6 [ 2866.132281] [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5 [ 2866.132281] [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4 [ 2866.132281] [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5 [ 2866.132281] [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f [ 2866.132281] [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43 [ 2866.132281] [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80 [ 2866.132281] [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0 [ 2866.132281] [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61 [ 2866.132281] [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1 [ 2866.132281] [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff81999d92>] call_transmit+0x1c5/0x20e [ 2866.132281] [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225 [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34 [ 2866.132281] [<ffffffff810835d6>] process_one_work+0x24d/0x47f [ 2866.132281] [<ffffffff81083567>] ? process_one_work+0x1de/0x47f [ 2866.132281] [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225 [ 2866.132281] [<ffffffff81083a6d>] worker_thread+0x236/0x317 [ 2866.132281] [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f [ 2866.132281] [<ffffffff8108b7b8>] kthread+0x9a/0xa2 [ 2866.132281] [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10 [ 2866.132281] [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13 [ 2866.132281] [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a [ 2866.132281] [<ffffffff81a12180>] ? gs_change+0x13/0x13 [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.309689] ============================================================================= [ 2866.310254] BUG TCP (Not tainted): Object already free [ 2866.310254] ----------------------------------------------------------------------------- [ 2866.310254] The bug comes from the fact that timer set in sk_reset_timer() can run before we actually do the sock_hold(). socket refcount reaches zero and we free the socket too soon. timer handler is not allowed to reduce socket refcnt if socket is owned by the user, or we need to change sk_reset_timer() implementation. We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED was used instead of TCP_DELACK_TIMER_DEFERRED. For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED, even if not fired from a timer. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: improve latencies of timer triggered eventsEric Dumazet2012-07-201-33/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modern TCP stack highly depends on tcp_write_timer() having a small latency, but current implementation doesn't exactly meet the expectations. When a timer fires but finds the socket is owned by the user, it rearms itself for an additional delay hoping next run will be more successful. tcp_write_timer() for example uses a 50ms delay for next try, and it defeats many attempts to get predictable TCP behavior in term of latencies. Use the recently introduced tcp_release_cb(), so that the user owning the socket will call various handlers right before socket release. This will permit us to post a followup patch to address the tcp_tso_should_defer() syndrome (some deferred packets have to wait RTO timer to be transmitted, while cwnd should allow us to send them sooner) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: early retransmit: delayed fast retransmitYuchung Cheng2012-05-021-0/+5
| | | | | | | | | | | | | Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2). Delays the fast retransmit by an interval of RTT/4. We borrow the RTO timer to implement the delay. If we receive another ACK or send a new packet, the timer is cancelled and restored to original RTO value offset by time elapsed. When the delayed-ER timer fires, we enter fast recovery and perform fast retransmit. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv4: Standardize prefixes for message loggingJoe Perches2012-03-121-6/+8
| | | | | | | | | | | | | | | | Add #define pr_fmt(fmt) as appropriate. Add "IPv4: ", "TCP: ", and "IPsec: " to appropriate files. Standardize on "UDPLite: " for appropriate uses. Some prefixes were previously "UDPLITE: " and "UDP-Lite: ". Add KBUILD_MODNAME ": " to icmp and gre. Remove embedded prefixes as appropriate. Add missing "\n" to pr_info in gre.c. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Disambiguate kernel messageArun Sharma2012-02-011-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Some of our machines were reporting: TCP: too many of orphaned sockets even when the number of orphaned sockets was well below the limit. We print a different message depending on whether we're out of TCP memory or there are too many orphaned sockets. Also move the check out of line and cleanup the messages that were printed. Signed-off-by: Arun Sharma <asharma@fb.com> Suggested-by: Mohan Srinivasan <mohan@fb.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: David Miller <davem@davemloft.net> Cc: Glauber Costa <glommer@parallels.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix assignment of 0/1 to bool variables.Rusty Russell2011-12-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | DaveM said: Please, this kind of stuff rots forever and not using bool properly drives me crazy. Joe Perches <joe@perches.com> gave me the spatch script: @@ bool b; @@ -b = 0 +b = false @@ bool b; @@ -b = 1 +b = true I merely installed coccinelle, read the documentation and took credit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* foundations of per-cgroup memory pressure controlling.Glauber Costa2011-12-121-1/+1
| | | | | | | | | | | | | | | | | This patch replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to acessor macros. Those macros can either receive a socket argument, or a mem_cgroup argument, depending on the context they live in. Since we're only doing a macro wrapping here, no performance impact at all is expected in the case where we don't have cgroups disabled. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: use IS_ENABLED(CONFIG_IPV6)Eric Dumazet2011-12-111-1/+1
| | | | | | | Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* TCP: remove TCP_DEBUGFlavio Leitner2011-10-241-2/+0
| | | | | | | | It was enabled by default and the messages guarded by the define are useful. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Remove debug macro of TCP_CHECK_TIMERShan Wei2011-02-201-3/+0
| | | | | | | | | | Now, TCP_CHECK_TIMER is not used for debuging, it does nothing. And, it has been there for several years, maybe 6 years. Remove it to keep code clearer. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: use correct counters in CA_CWR state tooIlpo Järvinen2010-10-171-6/+7
| | | | | | | | | | | | | | As CWR is stronger than CA_Disorder state, we can miscount SACK/Reno failure into other timeouts. Not a bad problem as it can happen only due to ECN, FRTO detecting spurious RTO or xmit error which are the only callers of tcp_enter_cwr. And even then losses and RTO must still follow thereafter to actually end up into the relevant code paths. Compile tested. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-10-041-11/+14
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/Kconfig net/ipv4/tcp_timer.c
| * net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()Damian Lukowski2010-09-281-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes kernel Bugzilla Bug 18952 This patch adds a syn_set parameter to the retransmits_timed_out() routine and updates its callers. If not set, TCP_RTO_MIN is taken as the calculation basis as before. If set, TCP_TIMEOUT_INIT is used instead, so that sysctl_syn_retries represents the actual amount of SYN retransmissions in case no SYNACKs are received when establishing a new connection. Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2010-09-091-4/+4
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/main.c
| * tcp: Combat per-cpu skew in orphan tests.David S. Miller2010-08-251-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported by Anton Blanchard when we use percpu_counter_read_positive() to make our orphan socket limit checks, the check can be off by up to num_cpus_online() * batch (which is 32 by default) which on a 128 cpu machine can be as large as the default orphan limit itself. Fix this by doing the full expensive sum check if the optimized check triggers. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* | tcp: Add TCP_USER_TIMEOUT socket option.Jerry Chu2010-08-301-15/+25
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides a "user timeout" support as described in RFC793. The socket option is also needed for the the local half of RFC5482 "TCP User Timeout Option". TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int, when > 0, to specify the maximum amount of time in ms that transmitted data may remain unacknowledged before TCP will forcefully close the corresponding connection and return ETIMEDOUT to the application. If 0 is given, TCP will continue to use the system default. Increasing the user timeouts allows a TCP connection to survive extended periods without end-to-end connectivity. Decreasing the user timeouts allows applications to "fail fast" if so desired. Otherwise it may take upto 20 minutes with the current system defaults in a normal WAN environment. The socket option can be made during any state of a TCP connection, but is only effective during the synchronized states of a connection (ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK). Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will overtake keepalive to determine when to close a connection due to keepalive failure. The option does not change in anyway when TCP retransmits a packet, nor when a keepalive probe will be sent. This option, like many others, will be inherited by an acceptor from its listener. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv4: EXPORT_SYMBOL cleanupsEric Dumazet2010-07-121-1/+0
| | | | | | | | | CodingStyle cleanups EXPORT_SYMBOL should immediately follow the symbol declaration. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* TCP: avoid to send keepalive probes if receiving dataFlavio Leitner2010-04-271-2/+2
| | | | | | | | | | | | | | | | | | | RFC 1122 says the following: ... Keep-alive packets MUST only be sent when no data or acknowledgement packets have been received for the connection within an interval. ... The acknowledgement packet is reseting the keepalive timer but the data packet isn't. This patch fixes it by checking the timestamp of the last received data packet too when the keepalive timer expires. Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sk_dst_cache RCUificationEric Dumazet2010-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this work. sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock) This rwlock is readlocked for a very small amount of time, and dst entries are already freed after RCU grace period. This calls for RCU again :) This patch converts sk_dst_lock to a spinlock, and use RCU for readers. __sk_dst_get() is supposed to be called with rcu_read_lock() or if socket locked by user, so use appropriate rcu_dereference_check() condition (rcu_read_lock_held() || sock_owned_by_user(sk)) This patch avoids two atomic ops per tx packet on UDP connected sockets, for example, and permits sk_dst_lock to be much less dirtied. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Merge branch 'for-next' into for-linusJiri Kosina2010-03-081-1/+1
|\ | | | | | | | | | | | | | | | | Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c
| * tree-wide: Assorted spelling fixesDaniel Mack2010-02-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | net: TCP thin linear timeoutsAndreas Petlund2010-02-181-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | This patch will make TCP use only linear timeouts if the stream is thin. This will help to avoid the very high latencies that thin stream suffer because of exponential backoff. This mechanism is only active if enabled by iocontrol or syscontrol and the stream is identified as thin. A maximum of 6 linear timeouts is tried before exponential backoff is resumed. Signed-off-by: Andreas Petlund <apetlund@simula.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: account SYN-ACK timeouts & retransmissionsOctavian Purdila2010-01-171-0/+6
|/ | | | | | | | | | | | | | | Currently we don't increment SYN-ACK timeouts & retransmissions although we do increment the same stats for SYN. We seem to have lost the SYN-ACK accounting with the introduction of tcp_syn_recv_timer (commit 2248761e in the netdev-vger-cvs tree). This patch fixes this issue. In the process we also rename the v4/v6 syn/ack retransmit functions for clarity. We also add a new request_socket operations (syn_ack_timeout) so we can keep code in inet_connection_sock.c protocol agnostic. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Stalling connections: Move timeout calculation routineDamian Lukowski2009-12-081-0/+29
| | | | | | | | | | This patch moves retransmits_timed_out() from include/net/tcp.h to tcp_timer.c, where it is used. Reported-by: Frederic Leroy <fredo@starox.org> Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Fix for dst_negative_adviceKrishna Kumar2009-10-201-2/+2
| | | | | | | | | | | | | dst_negative_advice() should check for changed dst and reset sk_tx_queue_mapping accordingly. Pass sock to the callers of dst_negative_advice. (sk_reset_txq is defined just for use by dst_negative_advice. The only way I could find to get around this is to move dst_negative_() from dst.h to dst.c, include sock.h in dst.c, etc) Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: rename some inet_sock fieldsEric Dumazet2009-10-181-4/+4
| | | | | | | | | | | | | | | | In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert Backoff [v3]: Calculate TCP's connection close threshold as a time value.Damian Lukowski2009-09-011-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC 1122 specifies two threshold values R1 and R2 for connection timeouts, which may represent a number of allowed retransmissions or a timeout value. Currently linux uses sysctl_tcp_retries{1,2} to specify the thresholds in number of allowed retransmissions. For any desired threshold R2 (by means of time) one can specify tcp_retries2 (by means of number of retransmissions) such that TCP will not time out earlier than R2. This is the case, because the RTO schedule follows a fixed pattern, namely exponential backoff. However, the RTO behaviour is not predictable any more if RTO backoffs can be reverted, as it is the case in the draft "Make TCP more Robust to Long Connectivity Disruptions" (http://tools.ietf.org/html/draft-zimmermann-tcp-lcd). In the worst case TCP would time out a connection after 3.2 seconds, if the initial RTO equaled MIN_RTO and each backoff has been reverted. This patch introduces a function retransmits_timed_out(N), which calculates the timeout of a TCP connection, assuming an initial RTO of MIN_RTO and N unsuccessful, exponentially backed-off retransmissions. Whenever timeout decisions are made by comparing the retransmission counter to some value N, this function can be used, instead. The meaning of tcp_retries2 will be changed, as many more RTO retransmissions can occur than the value indicates. However, it yields a timeout which is similar to the one of an unpatched, exponentially backing off TCP in the same scenario. As no application could rely on an RTO greater than MIN_RTO, there should be no risk of a regression. Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert Backoff [v3]: Revert RTO on ICMP destination unreachableDamian Lukowski2009-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Here, an ICMP host/network unreachable message, whose payload fits to TCP's SND.UNA, is taken as an indication that the RTO retransmission has not been lost due to congestion, but because of a route failure somewhere along the path. With true congestion, a router won't trigger such a message and the patched TCP will operate as standard TCP. This patch reverts one RTO backoff, if an ICMP host/network unreachable message, whose payload fits to TCP's SND.UNA, arrives. Based on the new RTO, the retransmission timer is reset to reflect the remaining time, or - if the revert clocked out the timer - a retransmission is sent out immediately. Backoffs are only reverted, if TCP is in RTO loss recovery, i.e. if there have been retransmissions and reversible backoffs, already. Changes from v2: 1) Renaming of skb in tcp_v4_err() moved to another patch. 2) Reintroduced tcp_bound_rto() and __tcp_set_rto(). 3) Fixed code comments. Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>