I got the following oops during interface ifup. Unfortunately its not
easily reproducable so I cant say for sure that my fix fixes this
problem, but I am confident and I think its correct anyway:
<2>kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:234!
<4>illegal operation: 0001 [#1] PREEMPT SMP
<4>Modules linked in:
<4>CPU: 0 Not tainted 2.6.24zlive-guest-07293-gf1ca151-dirty #91
<4>Process swapper (pid: 0, task: 0000000000800938, ksp: 000000000084ddb8)
<4>Krnl PSW : 0404300180000000 0000000000466374 (vring_disable_cb+0x30/0x34)
<4> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
<4>Krnl GPRS: 0000000000000001 0000000000000001 0000000010003800 0000000000466344
<4> 000000000e980900 00000000008848b0 000000000084e748 0000000000000000
<4> 000000000087b300 0000000000001237 0000000000001237 000000000f85bdd8
<4> 000000000e980920 00000000001137c0 0000000000464754 000000000f85bdd8
<4>Krnl Code: 0000000000466368: e3b0b0700004 lg %r11,112(%r11)
<4> 000000000046636e: 07fe bcr 15,%r14
<4> 0000000000466370: a7f40001 brc 15,466372
<4> >0000000000466374: a7f4fff6 brc 15,466360
<4> 0000000000466378: eb7ff0500024 stmg %r7,%r15,80(%r15)
<4> 000000000046637e: a7f13e00 tmll %r15,15872
<4> 0000000000466382: b90400ef lgr %r14,%r15
<4> 0000000000466386: a7840001 brc 8,466388
<4>Call Trace:
<4>([<000201500f85c000>] 0x201500f85c000)
<4> [<0000000000466556>] vring_interrupt+0x72/0x88
<4> [<00000000004801a0>] kvm_extint_handler+0x34/0x44
<4> [<000000000010d22c>] do_extint+0xbc/0xf8
<4> [<0000000000113f98>] ext_no_vtime+0x16/0x1a
<4> [<000000000010a182>] cpu_idle+0x216/0x238
<4>([<000000000010a162>] cpu_idle+0x1f6/0x238)
<4> [<0000000000568656>] rest_init+0xaa/0xb8
<4> [<000000000084ee2c>] start_kernel+0x3fc/0x490
<4> [<0000000000100020>] _stext+0x20/0x80
<4>
<4> <0>Kernel panic - not syncing: Fatal exception in interrupt
<4>
After looking at the code and the dump I think the following scenario
happened: Ifup was running on cpu2 and the interrupt arrived on cpu0.
Now virtnet_open on cpu 2 managed to execute napi_enable and disable_cb
but did not execute rx_schedule. Meanwhile on cpu 0 skb_recv_done was
called by vring_interrupt, executed netif_rx_schedule_prep, which
succeeded and therefore called disable_cb. This triggered the BUG_ON,
as interrupts were already disabled by cpu 2.
I think the proper solution is to make the call to disable_cb depend on
the atomic update of NAPI_STATE_SCHED by using netif_rx_schedule_prep
in the same way as skb_recv_done.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This fixes a potential dangling xmit problem.
We also suppress refill interrupts until we need them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fix bug found by Christian Borntraeger: if the other side fills all
the registered network buffers before we enable NAPI, we will never
get an interrupt. The simplest fix is to process the input queue once
on open.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Hello Rusty,
virtnet_probe already calls alloc_etherdev, which calls ether_setup.
There is no need to do that again.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A reset function solves three problems:
1) It allows us to renegotiate features, eg. if we want to upgrade a
guest driver without rebooting the guest.
2) It gives us a clean way of shutting down virtqueues: after a reset,
we know that the buffers won't be used by the host, and
3) It helps the guest recover from messed-up drivers.
So we remove the ->shutdown hook, and the only way we now remove
feature bits is via reset.
We leave it to the driver to do the reset before it deletes queues:
the balloon driver, for example, needs to chat to the host in its
remove function.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we want to reset the device to remove them, this is simpler
(device is reset for us on driver remove).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1) Turn GSO on virtio net into an all-or-nothing (keep checksumming
separate). Having multiple bits is a pain: if you can't support something
you should handle it in software, which is still a performance win.
2) Make VIRTIO_NET_HDR_GSO_ECN a flag in the header, so it can apply to
IPv6 or v4.
3) Rename VIRTIO_NET_F_NO_CSUM to VIRTIO_NET_F_CSUM (ie. means we do
checksumming).
4) Add csum and gso params to virtio_net to allow more testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's far easier to deal with packets if we don't have to parse the
packet to figure out the header length to know how much to pull into
the skb data. Add the field to the virtio_net_hdr struct (and fix the
spaces that somehow crept in there).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It seems that virtio_net wants to disable callbacks (interrupts) before
calling netif_rx_schedule(), so we can't use the return value to do so.
Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
now returns void, rather than a boolean.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously we used a type/len pair within the config space, but this
seems overkill. We now simply define a structure which represents the
layout in the config space: the config space can now only be extended
at the end.
The main driver-visible changes:
1) We indicate what fields are present with an explicit feature bit.
2) Virtqueues are explicitly numbered, and not in the config space.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use it in virtio_net (replacing buggy version there), it's also going
to be used by TAP for partial csum support.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
The virtio code never hooked through the ->remove callback. Although
noone supports device removal at the moment, this code is already
needed for module unloading.
This of course also revealed bugs in virtio_blk, virtio_net and lguest
unloading paths.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The network driver uses two virtqueues: one for input packets and one
for output packets. This has nice locking properties (ie. we don't do
any for recv vs send).
TODO:
1) Big packets.
2) Multi-client devices (maybe separate driver?).
3) Resolve freeing of old xmit skbs (Christian Borntraeger)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org