[OpenWrt-Devel] How does OpenWRT start "network"?
Rafał Miłecki
2013-02-07 12:50:44 UTC
When I just boot OpenWRT (or enter failsafe mode) I get the following:
# ifconfig -a
eth0 Link encap:Ethernet HWaddr C8:3A:35:40:C1:88
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:4
lo Link encap:Local Loopback
LOOPBACK MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
However when I boot OpenWRT normally I get all the virtual interfaces
(bridges, VLANs, etc.):

[ 14.088000] device eth0.0 entered promiscuous mode
[ 14.092000] device eth0 entered promiscuous mode
[ 14.100000] br-lan: port 1(eth0.0) entered forwarding state
[ 14.104000] br-lan: port 1(eth0.0) entered forwarding state
[ 16.108000] br-lan: port 1(eth0.0) entered forwarding state
# ifconfig -a
br-lan Link encap:Ethernet HWaddr C8:3A:35:40:C1:88
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:402 (402.0 B)
eth0 Link encap:Ethernet HWaddr C8:3A:35:40:C1:88
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:4
eth0.0 Link encap:Ethernet HWaddr C8:3A:35:40:C1:88
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:402 (402.0 B)
eth0.1 Link encap:Ethernet HWaddr C8:3A:35:40:C1:88
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:1965 (1.9 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:48 errors:0 dropped:0 overruns:0 frame:0
TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3264 (3.1 KiB) TX bytes:3264 (3.1 KiB)
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.

I've seen
http://wiki.openwrt.org/doc/networking/start
http://wiki.openwrt.org/doc/networking/network.interfaces
http://wiki.openwrt.org/doc/uci/network
but they don't explain what scripts/tools start the network.

Can you point me to them? How can I perform the same operations
manually, one by one?
--
Rafał
Felix Fietkau
2013-02-07 13:36:56 UTC
Post by Rafał Miłecki
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.
I've seen
http://wiki.openwrt.org/doc/networking/start
http://wiki.openwrt.org/doc/networking/network.interfaces
http://wiki.openwrt.org/doc/uci/network
but they don't explain what scripts/tools start the network.
Can you point me to them? How can I manually perform the same
operations manually, one by one?
Network isn't being set up by scripts anymore. The 'netifd' package
manages the state of all network interfaces.
What's interesting about the output that you posted is that the tx
packet counter stays at 0, so there aren't that many possibilities about
what's going on. My guess is one of the following:

- link state is down
- tx packets are being dropped silently in the driver
- the MAC is locked up and tx packets are being held in the queue
(though shouldn't that produce netdev tx timeout warnings?)

The important difference between failsafe and normal mode is probably
the bridge, as it enables promisc mode.

- Felix
Rafał Miłecki
2013-02-07 15:59:07 UTC
Post by Felix Fietkau
Post by Rafał Miłecki
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.
I've seen
http://wiki.openwrt.org/doc/networking/start
http://wiki.openwrt.org/doc/networking/network.interfaces
http://wiki.openwrt.org/doc/uci/network
but they don't explain what scripts/tools start the network.
Can you point me to them? How can I manually perform the same
operations manually, one by one?
Network isn't being set up by scripts anymore. The 'netifd' package
manages the state of all network interfaces.
Hm, so it reads /etc/config/network directly? OK, I've cleaned that
file, rebooted, and indeed! No virtual interface was created!
Post by Felix Fietkau
What's interesting about the output that you posted is that the tx
packet counter stays at 0, so there aren't that many possibilities about
- link state is down
- tx packets are being dropped silently in the driver
- the MAC is locked up and tx packets are being held in the queue
(though shouldn't that produce netdev tx timeout warnings?)
The important difference between failsafe and normal mode is probably
the bridge, as it enables promisc mode.
I've tried "ifconfig eth0 promisc" and it didn't stop my interface. So
there must be a different problem, maybe related to the virtual
interfaces or the bridge...

I'll try to reproduce netifd's configuration by doing the same
manually (ifconfig and friends). Maybe that way I'll spot the moment
when the interface dies. I'll Google for it soon, but if someone has a
nice howto on creating virtual interfaces and a bridge, I'll be glad to
get the link ;)



The weird thing is that even after pinging my router about 100 times,
I still get
Post by Felix Fietkau
RX packets:0 errors:0 dropped:1 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
Whoops, there must be a bug in bgmac!
--
Rafał
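For the manual reproduction described above (virtual interfaces plus a bridge), the normal-mode setup can be approximated by hand. The sketch below only prints commands; it does not run them. The names eth0, eth0.0 and br-lan and the address 192.168.1.1/255.255.255.0 come from the ifconfig output in this thread. The exact sequence, the hypothetical helper manual_bridge_commands, and the use of "ip link ... type vlan" (which needs an ip binary with VLAN support) are assumptions, not what netifd literally executes.

```python
def manual_bridge_commands(parent="eth0", vlan_id=0, bridge="br-lan",
                           addr="192.168.1.1", netmask="255.255.255.0"):
    """Return shell commands that approximate the normal-mode setup by hand.

    Assumed sequence for illustration only; netifd talks to the kernel
    directly and does not run these commands.
    """
    vlan_if = f"{parent}.{vlan_id}"
    return [
        # Drop the IPv4 address from the raw device, keep the link up.
        f"ifconfig {parent} 0.0.0.0 up",
        # Create the VLAN sub-interface with an explicit name.
        f"ip link add link {parent} name {vlan_if} type vlan id {vlan_id}",
        f"ifconfig {vlan_if} up",
        # Create the bridge and enslave the VLAN sub-interface.
        f"brctl addbr {bridge}",
        f"brctl addif {bridge} {vlan_if}",
        # Move the LAN address onto the bridge.
        f"ifconfig {bridge} {addr} netmask {netmask} up",
    ]


if __name__ == "__main__":
    for number, command in enumerate(manual_bridge_commands(), start=1):
        print(f"step {number}: {command}")
```

Running the printed commands one at a time, with a ping in between, is one way to spot the step after which the interface stops responding.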
Rafał Miłecki
2013-02-07 16:02:29 UTC
Post by Rafał Miłecki
The weird thing is that even after pinging my router about ~100 time,
I still get
Post by Felix Fietkau
RX packets:0 errors:0 dropped:1 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
Whoops, there must be a bug in bgmac!
OK, I can see that: get_ethtool_stats (or, to be precise, the lack of it).
--
Rafał
Rafał Miłecki
2013-02-07 17:50:01 UTC
Post by Rafał Miłecki
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.
# brctl show
bridge name bridge id STP enabled interfaces
br-lan 8000.c83a3540c1a8 no eth0.0
It seems there is something wrong with eth0.0. When I switch br-lan to
eth0:
# brctl delif br-lan eth0.0
# brctl addif br-lan eth0
Ethernet starts working!

Please see attached file for log with "ifconfig -a" calls.

Does anyone have any idea what may be wrong with that eth0.0
interface? Why does using it for br-lan fail? Why does switching br-lan
to eth0 succeed? I expected eth0.0 to work the same way as eth0.


There is one more tricky part. You may wonder what happens if I switch
br-lan back to eth0.0 (from eth0). Well, pinging works again, but only
for a limited amount of time. Sometimes I can't see even one
"ping-pong" and sometimes it keeps working even for 20 seconds.
Post by Rafał Miłecki
brctl delif br-lan eth0.0
brctl addif br-lan eth0
brctl delif br-lan eth0
brctl addif br-lan eth0.0
Makes pinging work again for a few seconds.
--
Rafał
Simon G
2013-02-07 21:14:38 UTC
Hauke Mehrtens
2013-02-07 21:37:36 UTC
Post by Rafał Miłecki
Post by Rafał Miłecki
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.
# brctl show
bridge name bridge id STP enabled interfaces
br-lan 8000.c83a3540c1a8 no eth0.0
It seems there is something wrong with eth0.0. When I switch br-lan to
Post by Rafał Miłecki
# brctl delif br-lan eth0.0
# brctl addif br-lan eth0
Ethernet starts working!
Please see attached file for log with "ifconfig -a" calls.
Does anyone have any idea what may be wrong with that eth0.0
interface? Why using it for br-lan fails? Why swtiching br-lan to eth0
succeeds? I expected eth0.0 to work the same way as eth0.
There is one more tricky part. You may wonder what happens if I switch
br-lan back to eth0.0 (from eth0). Well pinging works again, but only
for a limited amount of time. Sometimes I can't see even one
"ping-pong" and sometimes it keeps working even for 20 seconds.
Post by Rafał Miłecki
brctl delif br-lan eth0.0
brctl addif br-lan eth0
brctl delif br-lan eth0
brctl addif br-lan eth0.0
Makes pinging work again for few seconds.
I assume the switch driver you are using is broken. I do not get some of
my devices to work with the bgmac driver + BCM53125 at all. After some
modifications to the switch driver I was able to transmit some packets
but did not receive anything. I do not want to invest much effort in
the old switch driver; that's the reason I want to get a separate phy
driver and use the b53 switch driver, and try this out or fix it if
necessary. I assume your problem is also related to the switch.

Did you apply the patches I added to OpenWrt [0] and sent for Linux
mainline kernel inclusion? They are needed to bridge the switch
interface, otherwise you get the problems described in this ticket [1].

Hauke

[0]: https://dev.openwrt.org/changeset/35507
[1]: https://dev.openwrt.org/ticket/12927
Florian Fainelli
2013-02-07 21:47:49 UTC
Post by Hauke Mehrtens
Post by Rafał Miłecki
Post by Rafał Miłecki
The problem is that in failsafe my ethernet driver works fine (I can
ping router and the other way), but in normal mode (with virtual
interfaces) it stops working. My intention is to create all virtual
interfaces manually, to see where exactly my eth driver fails.
# brctl show
bridge name bridge id STP enabled interfaces
br-lan 8000.c83a3540c1a8 no eth0.0
It seems there is something wrong with eth0.0. When I switch br-lan to
Post by Rafał Miłecki
# brctl delif br-lan eth0.0
# brctl addif br-lan eth0
Ethernet starts working!
Please see attached file for log with "ifconfig -a" calls.
Does anyone have any idea what may be wrong with that eth0.0
interface? Why using it for br-lan fails? Why swtiching br-lan to eth0
succeeds? I expected eth0.0 to work the same way as eth0.
There is one more tricky part. You may wonder what happens if I switch
br-lan back to eth0.0 (from eth0). Well pinging works again, but only
for a limited amount of time. Sometimes I can't see even one
"ping-pong" and sometimes it keeps working even for 20 seconds.
Post by Rafał Miłecki
brctl delif br-lan eth0.0
brctl addif br-lan eth0
brctl delif br-lan eth0
brctl addif br-lan eth0.0
Makes pinging work again for few seconds.
I assume the switch driver you are using is broken. I do not get some of
my devices to work with the bgmac driver + BCM53125 at all. After some
modifications to the switch driver I was able to transmit some packages
but did not receive anything. I do not want to invest many efforts in
the old switch driver that's the reason I want to get a separate phy
driver and use the b53 switch driver any try this out or fix it if
necessary. I assume your problem is also related to the switch.
Another thing you might want to check is that your driver accepts
and correctly processes Ethernet frames with a VLAN tag (especially that
enough room was made for the incoming skb, etc.) in both TX and RX paths.
Post by Hauke Mehrtens
Did you apply the patches I added to OpenWrt [0] and send for linux
mainline kernel inclusion? They are needed to make bridge the switch
interface, otherwise you get the problems described in this ticket [1].
Hauke
--
Florian
Rafał Miłecki
2013-02-08 07:49:27 UTC
Post by Florian Fainelli
Another thing you might want to check, is to ensure that your driver accepts
and correctly processes ethernet frames with a vlan tag (especially enough
room was made for the incoming skb etc ...) in both TX and RX paths.
The thing that bothers me is that communication breaks after ~20
seconds. With some bug in bgmac I expected the issue to be binary: it
works or doesn't.

I've put some debugging messages in bgmac and:
1) Nothing in bgmac (like bgmac_set_rx_mode or bgmac_set_mac_address
or bgmac_ioctl) gets called when eth0.0 stops working after ~20
seconds
2) There are still packets coming in on the interface! netif_receive_skb
is still getting called about 2 times per second (while pinging the
router).

So there are two options:
1) After some time received packets are getting corrupted and some
upper layer silently ignores them
2) Some upper layer starts filtering (or something similar) incoming packets

Will work on that.
--
Rafał
Rafał Miłecki
2013-02-08 10:02:54 UTC
Post by Rafał Miłecki
Post by Florian Fainelli
Another thing you might want to check, is to ensure that your driver accepts
and correctly processes ethernet frames with a vlan tag (especially enough
room was made for the incoming skb etc ...) in both TX and RX paths.
The thing that bothers me is that communication break after ~20
seconds. With some bug in bgmac I expected issue to be binary: it
works or doesn't.
1) Nothing in bgmac (like bgmac_set_rx_mode or bgmac_set_mac_address
or bgmac_ioctl) gets called when eth0.0 stops working after ~20
seconds
2) There are still packets coming on the interface! netif_receive_skb
is still getting called about 2times per second (during pinging
router).
1) After some time received packages are getting corrupted and some
upper layer silently ignores them
2) Some upper layer starts filtering or sth. incoming packets
Anyone interested in looking at attached file?

1) Right after booting OpenWRT (when pinging doesn't work) I get 1
packet / ping request with len 0x40

2) After switching br-lan from eth0.0 to eth0 pinging works and I get
1 packet / ping request with len 0x66. You can see there were also 3
extra packets with len 0x40, 0x40 and 0x52. The ones with len 0x66
were most probably coming from ping requests (there are ten of them).

3) After switching br-lan back to eth0.0 I still get mostly
0x66-length packets. Again, there are three 0x40-len packets around.
Pinging works.

4) 2 seconds later I started pinging again. No response. There are
again ten 0x66 packets, but there are also more 0x44-len packets.

Does it make sense to you? I still have to find out if bgmac transmits
packets correctly... To actually understand which way the transmission
fails.
--
Rafał
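The lengths reported above can be sanity-checked with a little arithmetic. This is a sketch under two assumptions that the thread does not confirm: that the PC sends pings with the common default of 56 data bytes, and that the logged length counts the Ethernet frame including the 4-byte FCS. The helper icmp_echo_frame_len is hypothetical and exists only for this illustration.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header
FCS = 4           # Ethernet frame check sequence
VLAN_TAG = 4      # one 802.1Q tag
MIN_FRAME = 64    # minimum Ethernet frame size, FCS included


def icmp_echo_frame_len(data_bytes=56, vlan_tagged=False):
    """On-wire length of an ICMP echo frame, padded to the Ethernet minimum."""
    length = ETH_HEADER + IPV4_HEADER + ICMP_HEADER + data_bytes + FCS
    if vlan_tagged:
        length += VLAN_TAG
    return max(length, MIN_FRAME)


if __name__ == "__main__":
    # Lengths quoted in the post, converted to decimal.
    for observed in (0x40, 0x44, 0x52, 0x66):
        print(f"{observed:#x} = {observed} bytes")
    print(f"untagged echo with 56 data bytes: {icmp_echo_frame_len():#x}")
    print(f"minimum Ethernet frame: {MIN_FRAME:#x}")
```

Under these assumptions, 0x66 matches an untagged echo request with 56 data bytes, and 0x40 is the minimum frame size, which short frames such as ARP requests are padded to.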
Florian Fainelli
2013-02-08 11:24:40 UTC
Rafal,
Post by Rafał Miłecki
Post by Rafał Miłecki
Post by Florian Fainelli
Another thing you might want to check, is to ensure that your driver accepts
and correctly processes ethernet frames with a vlan tag (especially enough
room was made for the incoming skb etc ...) in both TX and RX paths.
The thing that bothers me is that communication break after ~20
seconds. With some bug in bgmac I expected issue to be binary: it
works or doesn't.
1) Nothing in bgmac (like bgmac_set_rx_mode or bgmac_set_mac_address
or bgmac_ioctl) gets called when eth0.0 stops working after ~20
seconds
2) There are still packets coming on the interface! netif_receive_skb
is still getting called about 2times per second (during pinging
router).
1) After some time received packages are getting corrupted and some
upper layer silently ignores them
2) Some upper layer starts filtering or sth. incoming packets
Anyone interested in looking at attached file?
What exactly does this dump represent? I cannot make sense out of it as
I do not even see some kind of Ethernet frame header?
Post by Rafał Miłecki
1) Right after booting OpenWRT (when pinging doesn't work) I get 1
packet / ping request with len 0x40
2) After switching br-lan from eth0.0 to eth0 pinging works and I get
1 packet / ping request with len 0x66. You can see there were also 3
extra packets with len 0x40, 0x40 and 0x52. The ones with len 0x66
were most probably coming from ping requests (there are ten of them).
3) After switching br-lan back to eth0.0 I still get mostly
0x66-length packets. Again, there are three 0x40-len packets around.
Pinging works.
4) 2 seconds later I started pinging again. No response. There are
again ten 0x66 packets, but there are also more 0x44-len packets.
Does it make sense to you? I still have to find out if bgmac transmits
packets correctly... To actually understand which way the transmission
fails.
_______________________________________________
openwrt-devel mailing list
https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Rafał Miłecki
2013-02-08 11:45:29 UTC
Post by Rafał Miłecki
Anyone interested in looking at attached file?
What exactly does this dump represent? I cannot make sense out of it as I do
not even see some kind of Ethernet frame header?
Crap. That's the whole received packet, including the Broadcom header at
the beginning. I'll fix my dumping & compare with Wireshark on the PC.
--
Rafał
Rafał Miłecki
2013-02-08 12:41:00 UTC
Post by Rafał Miłecki
Post by Rafał Miłecki
Anyone interested in looking at attached file?
What exactly does this dump represent? I cannot make sense out of it as I do
not even see some kind of Ethernet frame header?
Crap. That's the whole received packet, including Broadcom header at
the beginning. I'll fix my dumping & compare with Wireshark on PC.
OK, I have a better log now (from the non-working network). All packets
seem to be transmitted, but the length looks wrong...
--
Rafał
Rafał Miłecki
2013-02-08 21:34:32 UTC
Post by Rafał Miłecki
OK, I've a better log now (from not working network). All packets seem
to be transmitted, but the length looks wrong...
Another try, this time I didn't mess with pinging PC from the router.
Only pinging router from the PC.

That's crazy: there are 5 ARP packets received by bgmac with the
following content:
FF FF FF FF FF FF 00 1D BA 19 9E DB 08 06 00 01
08 00 06 04 00 01 00 1D BA 19 9E DB C0 A8 01 02
00 00 00 00 00 00 C0 A8 01 01 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 09 22 BE 18
However there is nothing in the system providing an answer.

Now: after typing
# brctl delif br-lan eth0.0
# brctl addif br-lan eth0

The same ARP packet arrives at the router (exactly the same content)
and... hooray, an answer appears and is passed to bgmac
(ndo_start_xmit gets called)! Maybe I should track what happens in
netif_receive_skb when using eth0.0 versus eth0... :|
--
Rafał
Felix Fietkau
2013-02-08 21:52:29 UTC
Post by Rafał Miłecki
Post by Rafał Miłecki
OK, I've a better log now (from not working network). All packets seem
to be transmitted, but the length looks wrong...
Another try, this time I didn't mess with pinging PC from the router.
Only pinging router from the PC.
That's crazy there are 5 ARP packets received by the bgmac with the
Post by Rafał Miłecki
FF FF FF FF FF FF 00 1D BA 19 9E DB 08 06 00 01
08 00 06 04 00 01 00 1D BA 19 9E DB C0 A8 01 02
00 00 00 00 00 00 C0 A8 01 01 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 09 22 BE 18
However there is nothing in the system providing an answer.
Now: after typing
# brctl delif br-lan eth0.0
# brctl addif br-lan eth0
The same ARP packet arrives to the router (exactly the same content)
and... hooray, there appears an answer, that is passed to the bgmac
(ndo_start_xmit gets called)! Maybe I should track what does happen in
netif_receive_skb in case of using eth0.0 and in case of eth0... :|
Ah, so you don't have the switch configured to send vlan0-tagged
packets. Untagged packets won't make it to eth0.0

- Felix
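Felix's diagnosis can be checked against the 64-byte dump quoted above. The sketch below decodes the frame and reports whether an 802.1Q tag (EtherType 0x8100) follows the source MAC. The bytes are copied from the thread; the helper names (mac, ipv4, decode) are made up for this example, and only the fields needed here are parsed.

```python
import struct

# Frame bytes exactly as posted in the thread.
DUMP = """
FF FF FF FF FF FF 00 1D BA 19 9E DB 08 06 00 01
08 00 06 04 00 01 00 1D BA 19 9E DB C0 A8 01 02
00 00 00 00 00 00 C0 A8 01 01 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 09 22 BE 18
"""

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_ARP = 0x0806


def mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def ipv4(raw):
    return ".".join(str(b) for b in raw)


def decode(frame):
    """Decode the Ethernet header, an optional VLAN tag, and basic ARP fields."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    info = {"dst": mac(frame[0:6]), "src": mac(frame[6:12]), "vlan": None}
    offset = 14
    if ethertype == ETHERTYPE_VLAN:
        # Tagged frame: 2 bytes TCI, then the real EtherType.
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        info["vlan"] = tci & 0x0FFF
        offset = 18
    info["ethertype"] = ethertype
    if ethertype == ETHERTYPE_ARP:
        (info["arp_op"],) = struct.unpack("!H", frame[offset + 6:offset + 8])
        info["sender_ip"] = ipv4(frame[offset + 14:offset + 18])
        info["target_ip"] = ipv4(frame[offset + 24:offset + 28])
    return info


frame = bytes.fromhex("".join(DUMP.split()))

if __name__ == "__main__":
    for key, value in decode(frame).items():
        print(f"{key}: {value}")
```

For the posted bytes, the EtherType directly after the source MAC is 0x0806 (ARP), so the frame carries no VLAN tag: it is an ARP request from 192.168.1.2 asking for 192.168.1.1. That is consistent with the point above that untagged frames are delivered to eth0 and never reach eth0.0.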

Rafał Miłecki
2013-02-08 07:34:37 UTC
Post by Hauke Mehrtens
I assume the switch driver you are using is broken. I do not get some of
my devices to work with the bgmac driver + BCM53125 at all. After some
modifications to the switch driver I was able to transmit some packages
but did not receive anything. I do not want to invest many efforts in
the old switch driver that's the reason I want to get a separate phy
driver and use the b53 switch driver any try this out or fix it if
necessary. I assume your problem is also related to the switch.
As said above, I disabled switch drivers to analyze "pure" bgmac first.
Post by Hauke Mehrtens
Did you apply the patches I added to OpenWrt [0] and send for linux
mainline kernel inclusion? They are needed to make bridge the switch
interface, otherwise you get the problems described in this ticket [1].
[0]: https://dev.openwrt.org/changeset/35507
Of course.
Post by Hauke Mehrtens
[1]: https://dev.openwrt.org/ticket/12927
I don't have or load any WiFi driver.
--
Rafał
Rafał Miłecki
2013-02-08 07:31:03 UTC
Post by Simon G
Could it have something to do with ethernet VLAN ?
Do you have any control of the ethernet switch (Internal on your router)?
VLAN enabled etc.
My device has switch, but I don't touch it. I didn't compile
kmod-switch or swconfig.
Post by Simon G
Try if eth0.1 works instead.
The same story. It doesn't work, however if I switch br-lan from
"eth0" to "eth0.1" it works for ~20 seconds.
--
Rafał
Pietro Paolini
2013-02-08 10:51:14 UTC
Hello all,

I have a simple question related to the build system.

In the .config file I can read different types of lines, like:

CONFIG_TARGET
CONFIG_DEFAULT
CONFIG_PACKAGE
...

Are these variables simply passed to the other makefiles in the
subdirectories, or do they serve other purposes?

Say a package X is not selected: is it not compiled because the
Makefile of package X sees CONFIG_PACKAGE_X=n, or because it isn't
called by the build system at all?
Is there a (short) paper where I can find answers to this type of
question?

Many thanks !

Pietro.
Sergey Ryazanov
2013-02-08 11:22:14 UTC
Post by Pietro Paolini
These variables are simply passed at the other makefiles in the
subdirectory or there are other purpose of them ?
Some of them are used by the build system (e.g. CONFIG_PACKAGE_foo); the
rest of the variables are used by particular packages.
Post by Pietro Paolini
Let's a package X be non selected, it will not be compiled because the
Makefile of the X package see the CONFIG_PACKAGE_X=n or because it isn't
called by the build system ?
If the package is not selected, then the Makefile will not even be called.
Post by Pietro Paolini
There is a (short) paper where can I find answer for this type of
questions ?
Start reading here [1]. Actually, Google knows even more.

1. http://wiki.openwrt.org/doc/devel/packages
--
BR,
Sergey
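To see which packages a given .config selects, the CONFIG_PACKAGE_ lines can be listed directly. The sketch below assumes the usual kconfig text format, where a selected symbol appears as "=y" (built in) or "=m" (built as a module/package) and an unselected one appears as a "# ... is not set" comment rather than "=n". The sample content and the helper selected_packages are made up for illustration; this is not part of the OpenWrt build system.

```python
import re

# Hypothetical .config excerpt, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_TARGET_brcm47xx=y
CONFIG_PACKAGE_netifd=y
CONFIG_PACKAGE_kmod-switch=m
# CONFIG_PACKAGE_swconfig is not set
CONFIG_DEFAULT_base-files=y
"""

# Matches selected package symbols only (y or m).
PACKAGE_LINE = re.compile(r"^CONFIG_PACKAGE_(?P<name>[^=]+)=(?P<value>[ym])$")


def selected_packages(config_text):
    """Map each selected package name to 'y' or 'm'."""
    packages = {}
    for line in config_text.splitlines():
        match = PACKAGE_LINE.match(line.strip())
        if match:
            packages[match.group("name")] = match.group("value")
    return packages


if __name__ == "__main__":
    for name, value in sorted(selected_packages(SAMPLE_CONFIG).items()):
        print(f"{name}: {value}")
```

Packages that do not show up in this list are the ones whose Makefile, per the answer above, is not called at all.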