Discussion:
Wifi-related kernel-oops on mt7621 after 4.14 update
Kristian Evensen
2018-04-12 10:42:53 UTC
Hello,

I have recently updated some ramips mt7621-devices (ZBT WG3526) to the
latest nightly. Almost everything seems to work fine, but using either
wifi interface in client mode seems to trigger an oops. I see two
different oops-messages:

Message 1:
[ 66.442802] CPU 1 Unable to handle kernel paging request at virtual
address e9e9e0d5, epc == 8f3e060c, ra == 8ec86fac
[ 66.453460] Oops[#1]:
[ 66.455743] CPU: 1 PID: 3679 Comm: wifib Tainted: G W 4.14.32 #0
[ 66.462857] task: 8e223200 task.stack: 8e1b4000
[ 66.467374] $ 0 : 00000000 00000001 7abc2e80 00000020
[ 66.472612] $ 4 : 8ec48bc0 8e76dc20 e9e9dae0 8e1b5848
[ 66.477847] $ 8 : 8ec4902c 80452968 00ee4000 ffffff80
[ 66.483061] $12 : 80583f8c 00000040 00000000 77f0f3c0
[ 66.488276] $16 : 8ec49560 8f578000 8e76d480 8ec48bc0
[ 66.493493] $20 : 00000000 00000002 8e1b5cb8 00000008
[ 66.498711] $24 : 00000000 77e74ff0
[ 66.503937] $28 : 8e1b4000 8e1b5780 00000000 8ec86fac
[ 66.509153] Hi : 00000000
[ 66.512020] Lo : 00000068
[ 66.514913] epc : 8f3e060c 0x8f3e060c
[ 66.518866] ra : 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[ 66.524843] Status: 11007c03 KERNEL EXL IE
[ 66.529015] Cause : 40800008 (ExcCode 02)
[ 66.533005] BadVA : e9e9e0d5
[ 66.535869] PrId : 0001992f (MIPS 1004Kc)
[ 66.539941] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[ 66.610889] nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[ 66.681822] ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[ 66.753184] ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[ 66.763357] Process wifib (pid: 3679, threadinfo=8e1b4000,
task=8e223200, tls=77f10ec0)
[ 66.771321] Stack : 00000000 00000000 00000000 00000000 00000000
00000000 8e1b5848 8f578000
[ 66.779654] 8e76d480 8ec48bc0 8f578130 00000002 8e1b5cb8
00000008 00000000 8ec86fac
[ 66.787987] 01000000 8e134628 00000007 8e1b5b98 8e134628
00000000 8e1b5b90 8ec49014
[ 66.796325] 8e76d000 00000000 fffffffe 00000002 8e1b5cb8
8ec9e338 8ec315ac 00000000
[ 66.804661] 000001d2 80580000 00000000 00000000 00000000
8e134628 8e068840 8ec1fb28
[ 66.812996] ...
[ 66.815446] Call Trace:
[ 66.817894] [<8f3e060c>] 0x8f3e060c
[ 66.821370] Code: 000630c0 02063021 94f40002 <90d205f5> 00e0b025
16800002 3253ffff 2414001f 96d50004
[ 66.831098]
[ 66.833187] ---[ end trace 8c8a003de3eabcd8 ]---
[ 66.841897] Kernel panic - not syncing: Fatal exception
[ 66.849317] Rebooting in 3 seconds..

Message 2:
[ 132.613293] CPU 0 Unable to handle kernel paging request at virtual
address ea9160d5, epc == 8f2c060c, ra == 8ec86fac
[ 132.623927] Oops[#1]:
[ 132.626199] CPU: 0 PID: 41 Comm: kworker/u8:3 Tainted: G W
4.14.32 #0
[ 132.633882] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
[ 132.639431] task: 8fd48c80 task.stack: 8fd94000
[ 132.643933] $ 0 : 00000000 00000001 7ac52e80 00000020
[ 132.649141] $ 4 : 8f2d0bc0 8e04dc20 ea915ae0 8f122400
[ 132.654350] $ 8 : 00000000 80452970 8fc02b00 0005376b
[ 132.659558] $12 : 000012d8 00000000 ffffffff 0000001c
[ 132.664766] $16 : 8f2d1560 8f58a000 8e04d480 8f2d0bc0
[ 132.669973] $20 : 00000000 00000001 8f2d1014 00000000
[ 132.675181] $24 : 3b9aca00 00000000
[ 132.680390] $28 : 8fd94000 8fd95c88 8ece1618 8ec86fac
[ 132.685605] Hi : 000007d0
[ 132.688473] Lo : 00000bb8
[ 132.691357] epc : 8f2c060c 0x8f2c060c
[ 132.695235] ra : 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[ 132.701212] Status: 11008403 KERNEL EXL IE
[ 132.705391] Cause : 40800008 (ExcCode 02)
[ 132.709380] BadVA : ea9160d5
[ 132.712247] PrId : 0001992f (MIPS 1004Kc)
[ 132.716320] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[ 132.787381] nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[ 132.858369] ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[ 132.929808] ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[ 132.939989] Process kworker/u8:3 (pid: 41, threadinfo=8fd94000,
task=8fd48c80, tls=00000000)
[ 132.948385] Stack : 00000001 8f2c08fc 8f2d1330 8e04d480 8f2d0bc0
8e04d480 8f122400 8f58a000
[ 132.956736] 8e04d480 8f2d0bc0 8f58a130 00000001 8f2d1014
00000000 8ece1618 8ec86fac
[ 132.965084] 00000002 8f58a000 00000001 8ec86df4 8f58a000
8f2d0bc0 8f58a000 8f122400
[ 132.973434] 8e04d480 8e04d480 8fd95d38 00000001 8f2d1014
8ec87a10 00000000 8007be44
[ 132.981784] 00000000 00000000 00000000 8fd95d10 8fd95d30
8f2d102c 8f2d102c 8ec87de8
[ 132.990130] ...
[ 132.992578] Call Trace:
[ 132.995025] [<8f2c060c>] 0x8f2c060c
[ 132.998506] Code: 000630c0 02063021 94f40002 <90d205f5> 00e0b025
16800002 3253ffff 2414001f 96d50004
[ 133.008251]
[ 133.011063] ---[ end trace 43bd4ffe21fcd0aa ]---
[ 133.019992] Kernel panic - not syncing: Fatal exception
[ 133.027692] Rebooting in 3 seconds..

The WG3526 uses mt7603 for 2.4GHz and mt7612 for 5GHz, and the error
happens with either. Using the interfaces as APs works fine (at least
in my tests), and using the interfaces as clients works fine with
kernel 4.9.

Thanks in advance for any help,
Kristian
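
For anyone reading the two traces above, the register dump can be cross-checked by hand. The Python sketch below assumes the standard MIPS32 encodings (I-type instruction fields, ExcCode in bits 6..2 of the Cause register) and uses only values copied from Message 1 and Message 2. The helper names are made up for illustration.

```python
# Values copied from the oops reports; the helper names are illustrative only.

LBU_OPCODE = 0x24  # MIPS32 "load byte unsigned"


def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction into (opcode, rs, rt, imm)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F      # base register
    rt = (word >> 16) & 0x1F      # destination register for loads
    imm = word & 0xFFFF
    if imm & 0x8000:              # the 16-bit offset is sign-extended
        imm -= 0x10000
    return opcode, rs, rt, imm


def exc_code(cause):
    """Extract the ExcCode field (bits 6..2) from the Cause register."""
    return (cause >> 2) & 0x1F


def effective_address(base, imm):
    """Address a load touches: base register plus offset, wrapped to 32 bits."""
    return (base + imm) & 0xFFFFFFFF


# The word marked <...> on the "Code:" line of both messages.
opcode, rs, rt, imm = decode_itype(0x90D205F5)
print(opcode == LBU_OPCODE, rs, rt, hex(imm))

# $6 from each register dump, compared against the reported BadVA.
for reg6, badva in ((0xE9E9DAE0, 0xE9E9E0D5), (0xEA915AE0, 0xEA9160D5)):
    print(hex(effective_address(reg6, imm)), hex(badva))

print(exc_code(0x40800008))
```

Under those assumptions the marked instruction is a byte load (`lbu`) from `$6 + 0x5f5`, and in both messages that sum equals BadVA, so the faulting access goes through the pointer held in `$6`. ExcCode 2 is the TLB exception for a load or instruction fetch. This only locates the bad access; it does not explain why the pointer is invalid.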
John Crispin
2018-04-12 11:02:22 UTC
Post by Kristian Evensen
Hello,
I have recently updated some ramips mt7621-devices (ZBT WG3526) to the
latest nightly. Almost everything seems to work fine, but using either
wifi interface in client mode seems to trigger an oops. I see two
The WG3526 uses mt7603 for 2.4GHz and mt7612 for 5GHz, and the error
happens with either. Using the interfaces as APs works fine (at least
in my tests), and using the interfaces as clients works fine with
kernel 4.9.
try enabling KALLSYMS to get a verbose stack trace.
    John
Post by Kristian Evensen
Thanks in advance for any help,
Kristian
_______________________________________________
openwrt-devel mailing list
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
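
Without KALLSYMS the oops shows only a raw address (`epc : 8f3e060c 0x8f3e060c`). As a stopgap, a raw address can sometimes be attributed to a module by comparing it with the load addresses in `/proc/modules`, provided the listing comes from the same boot and is read as root (otherwise the addresses may be hidden). A rough Python sketch follows. The sample listing and the looked-up address are hypothetical, not taken from the affected router, and `find_module` is an illustrative name.

```python
def find_module(addr, proc_modules_text):
    """Return (module, offset) for the module whose range contains addr.

    Expects text in /proc/modules format:
    name size refcount dependencies state address
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete /proc/modules record
        name, size, base = fields[0], int(fields[1]), int(fields[5], 16)
        if base <= addr < base + size:
            return name, addr - base
    return None


# HYPOTHETICAL listing: the module names occur in this thread,
# but the sizes and load addresses are made up for the example.
SAMPLE = """\
mt76x2e 65536 0 - Live 0x8f3d0000
mt76 32768 1 mt76x2e, Live 0x8f3a0000
mac80211 524288 2 mt76x2e,mt76, Live 0x8ec00000
"""

# HYPOTHETICAL address chosen to fall inside the made-up mt76 range.
print(find_module(0x8F3A1234, SAMPLE))
```

This narrows an address down to a module and an offset at best; a KALLSYMS build is still needed to get function names in the trace itself.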
Kristian Evensen
2018-04-12 13:28:48 UTC
Hi,
Post by John Crispin
try enabling KALLSYMS to get a verbose stack trace.
Thanks for the pointer. I compiled a new image with KALLSYMS enabled, but now I am
not able to reproduce the error. Perhaps there was something dirty in
my build directory. I will keep the KALLSYMS image on the routers and
keep checking for the error.

BR,
Kristian
Kristian Evensen
2018-04-17 11:50:58 UTC
Hi,

On Thu, Apr 12, 2018 at 3:28 PM, Kristian Evensen
Post by Kristian Evensen
Thanks for the pointer. I compiled a new image with KALLSYMS enabled, but now I am
not able to reproduce the error. Perhaps there was something dirty in
my build directory. I will keep the KALLSYMS image on the routers and
keep checking for the error.
The error came back after I updated my router again. Here are the
oopses with KALLSYMS enabled:

[ 36.714334] CPU 1 Unable to handle kernel paging request at virtual
address f32f0c10, epc == 8f391304, ra == 8f391304
[ 36.724966] Oops[#1]:
[ 36.727246] CPU: 1 PID: 33 Comm: kworker/u8:2 Tainted: G W
4.14.32 #0
[ 36.734949] Workqueue: phy1 ieee80211_ibss_leave [mac80211]
[ 36.740523] task: 8fd48000 task.stack: 8fd36000
[ 36.745037] $ 0 : 00000000 00000001 0000000e 00000001
[ 36.750270] $ 4 : 8f37957c 00000000 00000000 00000000
[ 36.755506] $ 8 : 00000000 80452970 00000001 00122121
[ 36.760726] $12 : 00000000 00000000 00000010 77ec6230
[ 36.765946] $16 : 8f37957c 8fd37d58 f32f0c10 94573690
[ 36.771173] $20 : 00000001 00000040 000000ff 8f378bc0
[ 36.776394] $24 : 000010d9 8f391218
[ 36.781615] $28 : 8fd36000 8fd37d10 00000000 8f391304
[ 36.786837] Hi : 0000969d
[ 36.789701] Lo : 00000110
[ 36.792603] epc : 8f391304 mt76_get_survey+0xec/0x31c [mt76]
[ 36.798417] ra : 8f391304 mt76_get_survey+0xec/0x31c [mt76]
[ 36.804220] Status: 11007c03 KERNEL EXL IE
[ 36.808399] Cause : 40800008 (ExcCode 02)
[ 36.812389] BadVA : f32f0c10
[ 36.815257] PrId : 0001992f (MIPS 1004Kc)
[ 36.819331] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[ 36.890390] nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[ 36.961381] ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[ 37.032821] ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[ 37.043006] Process kworker/u8:2 (pid: 33, threadinfo=8fd36000,
task=8fd48000, tls=00000000)
[ 37.051404] Stack : 00000001 80051a40 81494dc0 00000000 00000400
8f57c000 00000000 00000000
[ 37.059753] 00000000 8ec110dc 002f113b 00000000 8fc2a500
81494dc0 00000000 00000001
[ 37.068100] 814a2dc0 8fc2a614 94573690 00000000 00000000
00000000 00000000 00000000
[ 37.076449] 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[ 37.084797] 0000000c 00000000 00000000 8ec113a0 00000020
8149d2ec 814a2dc0 8fc2a500
[ 37.093147] ...
[ 37.095597] Call Trace:
[ 37.098045] [<8f391304>] mt76_get_survey+0xec/0x31c [mt76]
[ 37.103671] [<8ec110dc>]
___ieee80211_start_rx_ba_session+0x15c/0x39c [mac80211]
[ 37.111127] [<8ec113a0>] __ieee80211_start_rx_ba_session+0x84/0xb8 [mac80211]
[ 37.118315] [<8ec1144c>] ieee80211_process_addba_request+0x78/0x8c [mac80211]
[ 37.125507] [<8ec152a0>] ieee80211_ibss_leave+0x44c/0x19c8 [mac80211]
[ 37.132067] Code: 2610001c 0c116236 02002025 <8e440000> 3c058d4f
34a5df3b 00850019 00003012 00003810
[ 37.141817]
[ 37.143582] ---[ end trace 5af5293c693da408 ]---
[ 37.151753] Kernel panic - not syncing: Fatal exception in interrupt
[ 37.160354] Rebooting in 3 seconds..

[ 30.252516] CPU 0 Unable to handle kernel paging request at virtual
address eb44a0d5, epc == 8ed40ba4, ra == 8ec86fac
[ 30.263189] Oops[#1]:
[ 30.265506] CPU: 0 PID: 33 Comm: kworker/u8:2 Tainted: G W
4.14.32 #0
[ 30.273244] Workqueue: phy1 ieee80211_ibss_leave [mac80211]
[ 30.278811] task: 8fd48000 task.stack: 8fd36000
[ 30.283321] $ 0 : 00000000 00000001 7adc6e80 00000000
[ 30.288546] $ 4 : 8f3d8bc0 8fd27c20 eb449ae0 8e03a800
[ 30.293766] $ 8 : 00000000 80452970 00000007 0006edf8
[ 30.298985] $12 : 00000000 8ee8d0c0 00000007 1dcd6501
[ 30.304205] $16 : 8f3d9560 8f5b8800 8fd27480 8f3d8bc0
[ 30.309425] $20 : 8e03a800 00000000 80560000 fffffffe
[ 30.314645] $24 : 00000000 00000000
[ 30.319865] $28 : 8fd36000 8fd37cf8 80560000 8ec86fac
[ 30.325085] Hi : 0000329d
[ 30.327951] Lo : 0000010e
[ 30.330849] epc : 8ed40ba4 mt76x2_dma_cleanup+0x478/0x1128 [mt76x2e]
[ 30.337408] ra : 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[ 30.343386] Status: 11008403 KERNEL EXL IE
[ 30.347565] Cause : c0800008 (ExcCode 02)
[ 30.351553] BadVA : eb44a0d5
[ 30.354421] PrId : 0001992f (MIPS 1004Kc)
[ 30.358494] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[ 30.429553] nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[ 30.500542] ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[ 30.571979] ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[ 30.582158] Process kworker/u8:2 (pid: 33, threadinfo=8fd36000,
task=8fd48000, tls=00000000)
[ 30.590554] Stack : 8fd37d78 80081210 00000001 00000000 8e03a800
8f5b8800 8fd27480 8f3d8bc0
[ 30.598904] 8f5b8930 00000000 80560000 8ec86fac 00000000
8f5b8800 00000001 8ec86df4
[ 30.607252] 8f5b8800 8f3d8bc0 8f5b8800 8e03a800 8fd27480
8fd274ac 8f3d8bc0 00000000
[ 30.615601] 80560000 8ec87a10 8ee8dc00 8007be44 00000000
00000000 8fd37d70 8fd37d70
[ 30.623950] 00000000 8f5b8800 00000000 8ec87ac0 8f3d8bc0
8ec8574c 8f3d8bc0 00000000
[ 30.632297] ...
[ 30.634745] Call Trace:
[ 30.637194] [<8ed40ba4>] mt76x2_dma_cleanup+0x478/0x1128 [mt76x2e]
[ 30.643445] Code: 02063021 94e30002 00e0a025 <1060001a> 90d105f5
00031400 3c04ff00 00442024 14800003

This is with the same image as last time (commit
f6e6eadc99c6274207f8f2ebc739063549959a1f) and configuration (radios
used as clients). I see that mt76 has been updated during the weekend
so I will go ahead and compile a new image with the latest updates.

BR,
Kristian
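
With KALLSYMS enabled, each frame is printed as `symbol+0xoffset/0xsize [module]`. A small Python sketch for pulling those fields out of a pasted trace is given below. The regular expression and the function name are illustrative assumptions about the line format shown in the first trace above, not a kernel-provided tool. The sample text is copied from that trace, including the mail line wrap.

```python
import re

# One frame: "[<address>] symbol+0xoffset/0xsize [module]"; module is optional.
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per call-trace frame found in text."""
    frames = []
    for m in FRAME.finditer(text):
        addr = int(m["addr"], 16)
        off = int(m["off"], 16)
        frames.append({
            "symbol": m["sym"],
            "module": m["mod"],
            "start": addr - off,          # address where the function begins
            "offset": off,
            "size": int(m["size"], 16),
        })
    return frames


TRACE = """\
[ 37.098045] [<8f391304>] mt76_get_survey+0xec/0x31c [mt76]
[ 37.103671] [<8ec110dc>]
___ieee80211_start_rx_ba_session+0x15c/0x39c [mac80211]
[ 37.111127] [<8ec113a0>] __ieee80211_start_rx_ba_session+0x84/0xb8 [mac80211]
[ 37.118315] [<8ec1144c>] ieee80211_process_addba_request+0x78/0x8c [mac80211]
[ 37.125507] [<8ec152a0>] ieee80211_ibss_leave+0x44c/0x19c8 [mac80211]
"""

for f in parse_frames(TRACE):
    print(f"{(f['module'] or '-'):<9} {f['symbol']:<34} start={f['start']:#x}")
```

Subtracting the offset from the frame address gives the start of the function. For the first frame that is 0x8f391218, the same value that appears in `$25` in the register dump of that oops.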
Felix Fietkau
2018-04-17 12:56:39 UTC
Post by Kristian Evensen
This is with the same image as last time (commit
f6e6eadc99c6274207f8f2ebc739063549959a1f) and configuration (radios
used as clients). I see that mt76 has been updated during the weekend
so I will go ahead and compile a new image with the latest updates.
I'm about to push another update in a minute. Please wait for that and
test it. I fixed some more issues in the code.

- Felix
Kristian Evensen
2018-04-17 13:34:09 UTC
Post by Felix Fietkau
Post by Kristian Evensen
This is with the same image as last time (commit
f6e6eadc99c6274207f8f2ebc739063549959a1f) and configuration (radios
used as clients). I see that mt76 has been updated during the weekend
so I will go ahead and compile a new image with the latest updates.
I'm about to push another update in a minute. Please wait for that and
test it. I fixed some more issues in the code.
Thanks, great. I just started building a new image for my router, will
test and let you know if I still see the issue.

BR,
Kristian
Kristian Evensen
2018-04-18 09:34:10 UTC
Hi,

On Tue, Apr 17, 2018 at 3:34 PM, Kristian Evensen
Post by Kristian Evensen
Thanks, great. I just started building a new image for my router, will
test and let you know if I still see the issue.
I think I have finished my testing, at least for now, and it seems the
problem is fixed. I compiled an image with the latest changes to mt76,
installed the image on one of my WG3526-routers showing the issue,
configured both radios as clients and updated the router ~10 times,
rebooted, etc. I did not see the crash, wifi was rock solid. I then
"updated" to the older image without the latest changes and the oops
appeared right away.

I will keep an eye on this router, just in case, but it seems the
problem is gone. Thanks for fixing it so fast!

BR,
Kristian

TheWerthFam
2018-04-13 00:42:52 UTC
Post by John Crispin
Post by Kristian Evensen
Hello,
I have recently updated some ramips mt7621-devices (ZBT WG3526) to the
latest nightly. Almost everything seems to work fine, but using either
wifi interface in client mode seems to trigger an oops. I see two
The WG3526 uses mt7603 for 2.4GHz and mt7612 for 5GHz, and the error
happens with either. Using the interfaces as APs works fine (at least
in my tests), and using the interfaces as clients works fine with
kernel 4.9.
I saw a similar crash on my dual-core sunxi system with 4.14.
It seems to happen when the CPU is doing more than routing traffic.

[45729.124237] Unable to handle kernel NULL pointer deref8
[45729.132661] pgd = edc4ad00
[45729.135502] [00000028] *pgd=6e733003, *pmd=7fc26003
[45729.140895] Internal error: Oops: 207 [#1] PREEMPT SMP ARM
[45729.146387] Modules linked in: rt2800usb rt2800lib rt2x00usb
rt2x00lib pppoet
[45729.217382]  ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink
ip6t_REJEm
[45729.235828] CPU: 0 PID: 4028 Comm: e2guardian Not tainted 4.14.25 #0
[45729.242172] Hardware name: Allwinner sun7i (A20) Family
[45729.247391] task: edf71500 task.stack: edf7c000
[45729.251928] PC is at tcp_push+0x44/0xfc
[45729.255761] LR is at 0xed34eb34
[45729.258899] pc : [<c06285d4>]    lr : [<ed34eb34>] psr: 40000013
[45729.265156] sp : edf7de00  ip : ed416780 fp : ed34eb34
Post by John Crispin
try enabling KALLSYMS to get a verbose stack trace.
    John
Post by Kristian Evensen
Thanks in advance for any help,
Kristian
_______________________________________________
Lede-dev mailing list
http://lists.infradead.org/mailman/listinfo/lede-dev
Fushan Wen
2018-04-13 01:10:46 UTC
Post by TheWerthFam
[45729.251928] PC is at tcp_push+0x44/0xfc
This should be fixed in kernel 4.14.32. Try the latest snapshot.