created | 2020-11-19T18:33:56Z |
---|---|
begin | 2020-06-16T00:00:00Z |
end | 2020-06-17T00:00:00Z |
path | src/sys |
commits | 14 |
date | 2020-06-16T02:23:40Z | |||
---|---|---|---|---|
author | jmatthew | |||
files | src/sys/dev/pci/if_mcx.c | log | diff | annotate |
message |
Use separate event queues and interrupt vectors for admin/link events and tx/rx completions. This is another step towards using multiple queues. Each queue will have its own completion queue, event queue and UAR, which keeps everything simple and should avoid the need for any locks. ok dlg@ |
date | 2020-06-16T04:46:49Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/net/toeplitz.c | log | diff | annotate |
src/sys/net/toeplitz.h | log | diff | annotate | |
message |
Add a symmetric toeplitz implementation, with integration for nics. This is another bit of the puzzle for supporting multiple rx rings and receive side scaling (RSS) on nics. It borrows heavily from DragonflyBSD, but I've made some tweaks on the way. The interesting bit that dfly came up with was a way to use Toeplitz hashing so the kernel AND network interfaces hash packets so that packets in both directions land in the same bucket. The other interesting thing is that they optimised the hash calculation by building a cache of all the intermediate results possible for each input byte. Their hash calculation is simply xoring these intermediate results together. So this diff adds an API for the kernel to use for calculating a hash for ip addresses and ports, and adds a function for network drivers to call that gives them a key to use with RSS. If all drivers use the same key, then the same flows should be steered to the same place when they enter the network stack regardless of which hardware they came in on. The changes I made relative to dfly are around reducing the size of the caches. DragonflyBSD builds a cache of 32bit values, but because of how the Toeplitz key is constructed, the 32bits are made up of a repeated 16bit pattern. We can just store the 16 bits and reconstruct the 32 bits if we want. Both us and dragonfly only use 15 or 16 bits of the result anyway, so 32bits is unnecessary. Secondly, the dfly implementation keeps a cache of values for the high and low bytes of input, but the values in the two caches are almost the same. You can byteswap the values in one of the byte caches to get the values for the other, but you can also just byteswap values at runtime to get the same value, which is what this implementation does. The result of both these changes is that the byte cache is a quarter of the size of the one in dragonflybsd.
tb@ has had a close look at this and found a bunch of other optimisations that can be implemented, and because he's a wizard^Wmathematician he has proofs (and also did tests). ok tb@ jmatthew@ |
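
The caching and byteswap ideas described in this message can be illustrated with a small standalone sketch. This is not the code in src/sys/net/toeplitz.c: every name below (`sketch_rotl16`, `sketch_cache_init`, and so on) is hypothetical, and the key value 0x6d5a is an arbitrary choice for the demo. It assumes a Toeplitz key made of one 16-bit pattern repeated, keeps only a 16-bit result, and compares a bit-at-a-time reference against a version that uses a single 256-entry cache plus a runtime byteswap for odd-numbered input bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 16-bit value left by n (mod 16) bits. */
static uint16_t sketch_rotl16(uint16_t v, unsigned n)
{
    n &= 15;
    return (uint16_t)((v << n) | (v >> ((16 - n) & 15)));
}

static uint16_t sketch_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Reference: plain Toeplitz over the input bits, MSB first.  Because the key
 * is a repeated 16-bit pattern, the key window at bit i is just the pattern
 * rotated left by (i mod 16).
 */
static uint16_t sketch_hash_bitwise(uint16_t key, const uint8_t *p, size_t len)
{
    uint16_t h = 0;
    for (size_t i = 0; i < len * 8; i++)
        if (p[i / 8] & (0x80u >> (i % 8)))
            h ^= sketch_rotl16(key, (unsigned)(i % 16));
    return h;
}

/* One cache entry per possible value of an even-positioned input byte. */
struct sketch_cache { uint16_t bytes[256]; };

static void sketch_cache_init(struct sketch_cache *c, uint16_t key)
{
    for (unsigned b = 0; b < 256; b++) {
        uint16_t h = 0;
        for (unsigned bit = 0; bit < 8; bit++)
            if (b & (0x80u >> bit))
                h ^= sketch_rotl16(key, bit);
        c->bytes[b] = h;
    }
}

/*
 * Cached version: odd-positioned bytes sit 8 bits further along the key,
 * and rotating a 16-bit value by 8 is a byteswap, so one cache is enough.
 */
static uint16_t sketch_hash_cached(const struct sketch_cache *c,
    const uint8_t *p, size_t len)
{
    uint16_t h = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t v = c->bytes[p[i]];
        h ^= (i & 1) ? sketch_bswap16(v) : v;
    }
    return h;
}

/* Check: both versions agree, and swapping src/dst gives the same hash. */
static void sketch_selfcheck(void)
{
    /* src ip, dst ip, src port, dst port (example values only) */
    const uint8_t fwd[12] = { 192, 0, 2, 1, 198, 51, 100, 7, 0x1f, 0x90, 0xc3, 0x50 };
    const uint8_t rev[12] = { 198, 51, 100, 7, 192, 0, 2, 1, 0xc3, 0x50, 0x1f, 0x90 };
    struct sketch_cache c;

    sketch_cache_init(&c, 0x6d5a);
    assert(sketch_hash_cached(&c, fwd, 12) == sketch_hash_bitwise(0x6d5a, fwd, 12));
    assert(sketch_hash_bitwise(0x6d5a, fwd, 12) == sketch_hash_bitwise(0x6d5a, rev, 12));
}
```

The symmetry falls out of the 16-bit period: the source and destination fields are offset from each other by a multiple of 16 bits, so the same bit in either field selects the same rotation of the key, and the xor does not care which field contributed it.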
date | 2020-06-16T05:09:28Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/conf/files | log | diff | annotate |
message | wire stoeplitz code into the tree. |
date | 2020-06-16T05:09:29Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/kern/init_main.c | log | diff | annotate |
message | wire stoeplitz code into the tree. |
date | 2020-06-16T05:24:07Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/dev/pci/if_vmxreg.h | log | diff | annotate |
message |
show the structure for the rss configuration. i can't remember writing this, i think jmatthew@ did it, but i'm sniping the commit. sorry jmatthew@ |
date | 2020-06-16T05:31:15Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/dev/pci/files.pci | log | diff | annotate |
src/sys/dev/pci/if_vmx.c | log | diff | annotate | |
message |
configure toeplitz using the kernel stoeplitz key if needed. "if needed" basically means if more than 1 queue is set up, then set up rss. again, i think jmatthew@ wrote most of this, but i'm sniping it cos of the stoeplitz integration. |
date | 2020-06-16T09:41:21Z | |||
---|---|---|---|---|
author | jmatthew | |||
files | src/sys/dev/usb/if_atu.c | log | diff | annotate |
message |
Release the rx node if we were unable to allocate a new rx buffer. The node here is always ic_bss, for which the reference count isn't actually used (it's always freed when the interface detaches), so not releasing it in this case wasn't really a problem. ok stsp@ |
date | 2020-06-16T14:04:50Z | |||
---|---|---|---|---|
author | jsg | |||
files | src/sys/dev/pci/drm/include/linux/atomic.h | log | diff | annotate |
message | remove a dead store |
date | 2020-06-16T14:35:12Z | |||
---|---|---|---|---|
author | jsg | |||
files | src/sys/dev/pci/drm/include/linux/atomic.h | log | diff | annotate |
message | implement atomic_inc_not_zero() by way of atomic_add_unless() |
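
The relationship named in this message can be sketched in portable C11. This is an illustration only, not the contents of the drm compatibility header: the `sketch_` names are hypothetical, and it assumes the usual Linux semantics where `atomic_add_unless(v, a, u)` adds `a` to `v` unless `v` equals `u` and reports whether the add happened.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add a to *v unless *v == u; return true if the add was performed. */
static bool sketch_atomic_add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while (old != u) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(v, &old, old + a))
            return true;
    }
    return false;
}

/* Incrementing unless zero is then a special case: add 1 unless the value is 0. */
#define sketch_atomic_inc_not_zero(v) sketch_atomic_add_unless((v), 1, 0)

/* Usage check: a live count is bumped, a zero count is left alone. */
static void sketch_atomic_demo(void)
{
    atomic_int live = 1, dead = 0;

    assert(sketch_atomic_inc_not_zero(&live) && atomic_load(&live) == 2);
    assert(!sketch_atomic_inc_not_zero(&dead) && atomic_load(&dead) == 0);
}
```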
date | 2020-06-16T15:10:03Z | |||
---|---|---|---|---|
author | jsg | |||
files | src/sys/dev/pci/drm/include/linux/atomic.h | log | diff | annotate |
message | remove some unused defines |
date | 2020-06-16T17:38:12Z | |||
---|---|---|---|---|
author | kettenis | |||
files | src/sys/arch/powerpc64/conf/Makefile.powerpc64 | log | diff | annotate |
message | Add missing dependency. |
date | 2020-06-16T18:09:27Z | |||
---|---|---|---|---|
author | kettenis | |||
files | src/sys/arch/powerpc64/powerpc64/pmap.c | log | diff | annotate |
message | Some simplifications. |
date | 2020-06-16T21:49:30Z | |||
---|---|---|---|---|
author | mortimer | |||
files | src/sys/dev/rasops/rasops.c | log | diff | annotate |
message |
Remove old commented out line and fix indent. clang-10 complains about the misleading indentation. ok patrick@ |
date | 2020-06-16T23:35:10Z | |||
---|---|---|---|---|
author | dlg | |||
files | src/sys/arch/amd64/amd64/intr.c | log | diff | annotate |
message |
make intr_barrier run sched_barrier on the cpu the interrupt is pinned to. intr_barrier passed NULL to sched_barrier before this, which ends up being the primary cpu. that's been mostly right until this point, but is set to change. |