6723 Commits

Author SHA1 Message Date
David Ahern
977d51cfec Merge remote-tracking branch 'main/main' into next
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-05-03 15:40:02 +00:00
Lukasz Majewski
c72323d2ef ip link: hsr: Add support for passing information about INTERLINK device
The HSR capable device can operate in two modes of operations -
Doubly Attached Node for HSR (DANH) and RedBOX (HSR-SAN).

The latter one allows connection of non-HSR aware device(s) to HSR
network.
Such a device is called a SAN (Singly Attached Node) and is connected via
an INTERLINK network device.

This patch adds support for passing information about the INTERLINK
device, so the Linux driver can set it up properly.

Signed-off-by: Lukasz Majewski <lukma@denx.de>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-05-03 15:19:30 +00:00
David Ahern
0475c997c0 Update kernel headers
Update kernel headers to commit:
    5829614a7b3b ("Merge branch 'net-sysctl-sentinel'")

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-05-03 15:18:43 +00:00
Chiara Meiohas
57d7a8fd90 rdma: Add an option to display driver-specific QPs in the rdma tool
Utilize the -dd flag (driver-specific details) in the rdmatool
to view driver-specific QPs which are not exposed yet.

The following examples show mlx5 UMR QP which is visible now:

$ rdma resource show qp link ibp8s0f1
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

$ rdma resource show qp link ibp8s0f1 -dd
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

$ rdma resource show
0: ibp8s0f0: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2

$ rdma resource show -dd
0: ibp8s0f0: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-05-03 15:15:22 +00:00
Chiara Meiohas
e459ea4392 rdma: update uapi header
Update rdma_netlink.h file up to kernel commit e18fa0bbcedf
("RDMA/core: Add an option to display driver-specific QPs in the rdmatool")

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-05-03 15:14:55 +00:00
Stephen Hemminger
89210b9ec1 uapi: update vdpa.h
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-29 11:28:23 -07:00
Yedaya Katsman
70ba338cd8 ip: Exit exec in child process if setup fails
If we forked, returning from the function makes the calling code
continue in both the child and the parent process. Make cmd_exec exit if
setup failed and it has already forked.

An example of the issues this causes, where a failure in setup causes
multiple unnecessary tries:

```
$ ip netns
ef
ab
$ ip -all netns exec ls

netns: ef
setting the network namespace "ef" failed: Operation not permitted

netns: ab
setting the network namespace "ab" failed: Operation not permitted

netns: ab
setting the network namespace "ab" failed: Operation not permitted
```
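
The pattern described above can be sketched in isolation. This is a minimal illustration, not the iproute2 code: `exec_in_child`, `setup_fn` and `failing_setup` are hypothetical names, and the exit codes 1 and 127 are assumptions chosen for the sketch. The key point is that once the process has forked, the child leaves through `_exit()` instead of returning to the caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the setup step; returns 0 on success. */
typedef int (*setup_fn)(void *arg);

/*
 * Fork, run setup() in the child, then exec the command.
 * Returns the child's exit status in the parent, or -1 on error.
 */
static int exec_in_child(const char *file, char *const argv[],
			 setup_fn setup, void *arg)
{
	int status;
	pid_t pid;

	fflush(NULL);	/* avoid duplicating buffered output in the child */
	pid = fork();
	if (pid < 0) {
		perror("fork");
		return -1;
	}

	if (pid == 0) {
		/* Child: never return to the caller from here. */
		if (setup && setup(arg) != 0)
			_exit(1);	/* setup failed: exit instead of returning */
		execvp(file, argv);
		_exit(127);		/* exec itself failed */
	}

	/* Parent: wait for the child and report how it ended. */
	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical setup callback that always fails. */
static int failing_setup(void *arg)
{
	(void)arg;
	return -1;
}
```

If the child returned on setup failure instead of calling `_exit()`, both processes would carry on in the caller's loop, which matches the repeated "netns: ab" attempt in the output above.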

Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-25 12:00:25 -07:00
David Ahern
c34ca74085 Merge branch 'pfcp' into next
Wojciech Drewek says:

====================

A new PFCP module was accepted in the kernel together with cls_flower
changes which allow filtering packets using PFCP-specific fields [1].
Packet Forwarding Control Protocol is a 3GPP protocol defined in
TS 29.244 [2].

Extend ip link with support for the new PFCP device.
Add pfcp_opts support in tc-flower.

[1] https://lore.kernel.org/netdev/171196563119.11638.12210788830829801735.git-patchwork-notify@kernel.org/
[2] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-23 16:29:52 +00:00
Stephen Hemminger
911c62bf9d use missing argument helper
There is a helper in utilities to handle missing argument,
but it was not being used consistently.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-23 09:01:46 -07:00
Michal Swiatkowski
976dca372e f_flower: implement pfcp opts
Allow adding tc filter for PFCP header.

Add support for parsing TCA_FLOWER_KEY_ENC_OPTS_PFCP.
Options are as follows: TYPE:SEID.

TYPE is an 8-bit value represented in hex; it can be 1
for a session header and 0 for a node header. In a PFCP packet
this is the S flag in the header.

SEID is a 64-bit session id value represented in hex.
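
As an illustration of the TYPE:SEID format, here is a minimal parsing sketch. It is not the f_flower code: `struct pfcp_opt`, `parse_hex` and `parse_pfcp_opt` are hypothetical names, and accepting any 8-bit TYPE (not only 0 and 1) is an assumption of this sketch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two parsed fields. */
struct pfcp_opt {
	uint8_t type;	/* 1 = session header, 0 = node header */
	uint64_t seid;	/* 64-bit session id */
};

/* Parse one hex field of length len; returns 0 on success, -1 on error. */
static int parse_hex(const char *s, size_t len, uint64_t max, uint64_t *out)
{
	char buf[32];
	char *end;
	unsigned long long v;

	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, s, len);
	buf[len] = '\0';

	/* Reject signs and leading whitespace that strtoull() would accept. */
	if (!isxdigit((unsigned char)buf[0]))
		return -1;

	errno = 0;
	v = strtoull(buf, &end, 16);
	if (errno || *end != '\0' || v > max)
		return -1;
	*out = v;
	return 0;
}

/* Parse "TYPE:SEID", both fields in hex; returns 0 on success, -1 on error. */
static int parse_pfcp_opt(const char *str, struct pfcp_opt *opt)
{
	const char *colon = strchr(str, ':');
	uint64_t type, seid;

	if (!colon)
		return -1;
	if (parse_hex(str, (size_t)(colon - str), UINT8_MAX, &type) ||
	    parse_hex(colon + 1, strlen(colon + 1), UINT64_MAX, &seid))
		return -1;

	opt->type = (uint8_t)type;
	opt->seid = seid;
	return 0;
}
```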

This patch enables adding hardware filters using PFCP fields, see [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d823265dd45bbf14bd67aa476057108feb4143ce

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-23 15:34:19 +00:00
Wojciech Drewek
a25f6771be ip: PFCP device support
Packet Forwarding Control Protocol is a 3GPP Protocol defined in
TS 29.244 [1]. Add support for PFCP device type in ip link.
It is capable of receiving PFCP messages and extracting their
metadata (session ID).

Its only purpose is to be used together with tc flower to create
SW/HW filters.

The PFCP module does not take any netlink attributes, so there is no
need to parse any args. Add new sections to the man page to let the
user know about the new device type.

[1] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-23 15:31:23 +00:00
Jiayun Chen
11543416d9 man: fix doc, ip link does support "change"
ip link does support "change".

if (matches(*argv, "set") == 0 ||
    matches(*argv, "change") == 0)
    return iplink_modify(RTM_NEWLINK, 0,
                 argc-1, argv+1);

The attached patch documents this.

Signed-off-by: Jiayun Chen <jiayunchen@smail.nju.edu.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-20 20:07:28 -07:00
Stephen Hemminger
95c886b8e8 tc/util: remove unused argument from print_tcstats2_attr
The function doesn't use the FILE handle.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:45:48 +00:00
Stephen Hemminger
6879d2046b tc/police: remove unused argument to tc_print_police
FILE handle no longer used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:45:38 +00:00
Stephen Hemminger
2c42df8689 tc/util: remove unused argument from print_action_control
The FILE handle is no longer used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:43:52 +00:00
Stephen Hemminger
bf4022ebe6 tc/util: remove unused argument from print_tm
File argument no longer used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:41:56 +00:00
Stephen Hemminger
98b7262c12 tc/u32: remove FILE argument
The pretty printing routines no longer use the file handle.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:12:51 +00:00
David Ahern
e7b4fcb2af Merge remote-tracking branch 'main/main' into next
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21 01:12:29 +00:00
Arınç ÜNAL
dedcf62f39 man: use clsact qdisc for port mirroring examples on matchall and mirred
The clsact qdisc supports ingress and egress. Instead of using two qdiscs
to do ingress and egress port mirroring, clsact can be used. Therefore, use
clsact for the port mirroring examples on the tc-matchall.8 and tc-mirred.8
documents.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-16 08:31:56 -07:00
Stephen Hemminger
0a1e1522cd mnl: initialize generic netlink version
The version field in mnlu was being passed in but never set.
This meant that everywhere mnlu_gen_socket was used, the version would
be uninitialized data from malloc().

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-15 09:13:21 -07:00
Geliang Tang
c78640535b ss: mptcp: print out last time counters
Three new "last time" counters have been added to "struct mptcp_info":
last_data_sent, last_data_recv and last_ack_recv. They have been added
in commit 18d82cde7432 ("mptcp: add last time fields in mptcp_info") in
net-next recently.

This patch prints out these new counters into mptcp_stats output in ss.

Signed-off-by: Geliang Tang <geliang@kernel.org>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13 16:43:04 +00:00
Parav Pandit
e8add23c59 devlink: Support setting max_io_eqs
Devices send event notifications for the IO queues,
such as tx and rx queues, through event queues.

Enable a privileged owner, such as a hypervisor PF, to set the number
of IO event queues for the VF and SF during the provisioning stage.

example:
Get maximum IO event queues of the VF device::

  $ devlink port show pci/0000:06:00.0/2
  pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
      function:
          hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10

Set maximum IO event queues of the VF device::

  $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32

  $ devlink port show pci/0000:06:00.0/2
  pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
      function:
          hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13 16:34:21 +00:00
David Ahern
e6a170a9d4 Update kernel headers
Update kernel headers to commit:
    32affa5578f0 ("fib: rules: no longer hold RTNL in fib_nl_dumprule()")

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13 16:32:58 +00:00
renmingshuai
806b751a68 ip: Support filter links with no VF info
The kernel has added the IFLA_EXT_MASK attribute for indicating that certain
extended ifinfo values are requested by the user application. The ip
link show command always requests the VF extended ifinfo.

In this case, RTM_GETLINK for more than about 220 VFs truncates
IFLA_VFINFO_LIST because the maximum value of nlattr's nla_len is
exceeded. As a result, the ip link show command only shows the truncated
VF info, such as:

    #ip link show dev eth0
    1: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
        link/ether ...
        vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff ...
    Truncated VF list: eth0

This patch adds novf to support filtering links with no VF info:
ip link show novf

v2:
- use a one-word option instead of an option with on/off.
- fix the issue that broke the link filter changes
  already done for VFs.

v3:
- "novf" set vfinfo to 0 and the RTEXT_FILTER_VF flag is not added.

Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13 16:30:07 +00:00
Yusuke Ichiki
e67c9a7353 man: fix brief explanation of ip netns attach NAME PID
Rewrite the explanation as it was duplicated with that of
`ip netns add NAME`.

Signed-off-by: Yusuke Ichiki <public@yusuke.pub>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-03 10:13:52 -07:00
Max Gautier
f740f5a165 arpd: create /var/lib/arpd on first use
The motivation is to build distribution packages without /var to go
towards stateless systems, see link below (TL;DR: provisioning anything
outside of /usr on boot).

We only try to create the database directory when it's in the default
location, and assume its parent (/var/lib in the usual case) exists.

Links: https://0pointer.net/blog/projects/stateless.html
Signed-off-by: Max Gautier <mg@max.gautier.name>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-28 13:35:52 -07:00
Stephen Hemminger
037a3a0d66 ila: allow show, list and lst as synonyms
Across ip commands, show, list and the misspelling lst are treated
the same.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-28 13:33:05 -07:00
Date Huang
9ccf8fa8d4 bridge: vlan: fix compressvlans usage
Add the missing 'compressvlans' to man page

Signed-off-by: Date Huang <tjjh89017@hotmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-26 10:11:05 -07:00
Date Huang
43b5396863 bridge: vlan: fix compressvlans usage
Fix the incorrect short opt for compressvlans and color
in usage

Signed-off-by: Date Huang <tjjh89017@hotmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-26 10:11:05 -07:00
Stephen Hemminger
70e4a17624 uapi: update vdpa.h
Autogenerated from 6.9-rc1.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-24 18:16:06 -07:00
Denis Kirjanov
4da7bfbf91 ifstat: don't set errno if strdup fails
The strdup man page states that errno is set
by the function, so there is no need to set it.

Signed-off-by: Denis Kirjanov <dkirjanov@suse.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-19 21:17:55 -07:00
Denis Kirjanov
b22a3430bd ifstat: handle strdup return value
get_nlmsg_extended is missing the check as
it's done in get_nlmsg

v2: don't set the errno value explicitly

Signed-off-by: Denis Kirjanov <dkirjanov@suse.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-19 21:17:55 -07:00
Stephen Hemminger
4b3b5375a7 uapi: update headers
User headers based on pre 6.9-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-16 08:14:56 -07:00
David Ahern
7a6d30c95d Merge branch 'nexthop-grp-stats' into next
Petr Machata says:

====================

Next hop group stats allow verification of balancedness of a next hop
group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch
'nexthop-group-stats'"). This patchset adds to ip the corresponding
support.

NH group stats come in two flavors: as statistics for SW and for HW
datapaths. The former is shown when -s is given to "ip nexthop". The latter
demands more work from the kernel, and possibly driver and HW, and might
not be always necessary. Therefore tie it to -s -s, similarly to how ip
link shows more detailed stats when -s is given twice.

Here's an example usage:

 # ip link add name gre1 up type gre \
      local 172.16.1.1 remote 172.16.1.2 tos inherit
 # ip nexthop replace id 1001 dev gre1
 # ip nexthop replace id 1002 dev gre1
 # ip nexthop replace id 1111 group 1001/1002 hw_stats on
 # ip -s -s -j -p nexthop show id 1111
 [ {
 	[ ...snip... ]
         "hw_stats": {
             "enabled": true,
             "used": true
         },
         "group_stats": [ {
                 "id": 1001,
                 "packets": 0,
                 "packets_hw": 0
             },{
                 "id": 1002,
                 "packets": 0,
                 "packets_hw": 0
             } ]
     } ]

hw_stats.enabled shows whether hw_stats have been requested for the given
group. hw_stats.used shows whether any driver actually implemented the
counter. group_stats[].packets shows the total stats, packets_hw only the
HW-datapath stats.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:05:23 +00:00
Petr Machata
69d1c2c4aa ip: ipnexthop: Allow toggling collection of nexthop group HW statistics
Besides SW datapath stats, the kernel also supports collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Since collection of
these statistics may consume HW resources, there is an interface to request
that the HW stats be recorded. Add this toggle to "ip nexthop".

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:03:38 +00:00
Petr Machata
a50655e730 ip: ipnexthop: Support dumping next hop group HW stats
Besides SW datapath stats, the kernel also supports collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Request that these be
collected when ip is given "-s -s", similarly to how "ip link" shows more
statistics in that case.

Besides the statistics themselves, also show whether the collection of HW
statistics was in fact requested, and whether any driver actually
implemented the request.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:03:34 +00:00
Petr Machata
529ada74c4 ip: ipnexthop: Support dumping next hop group stats
Next hop group stats allow verification of balancedness of a next hop
group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch
'nexthop-group-stats'"). Add to ip the corresponding support. The
statistics are requested if "ip nexthop" is started with -s.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:03:09 +00:00
Petr Machata
95836fbf35 libnetlink: Add rta_getattr_uint()
NLA_UINT attributes have a 4-byte payload if possible, and an 8-byte one if
necessary. Add a function to extract these. Since we need to dispatch on
length anyway, make the getter truly universal by supporting also u8 and
u16.
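
The length dispatch described above can be sketched on a raw payload buffer. This is an illustration only, not the libnetlink implementation: `read_uint_payload` is a hypothetical name, it takes a plain pointer and length instead of a `struct rtattr`, and returning `UINT64_MAX` for an unexpected length is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read an unsigned integer of 1, 2, 4 or 8 bytes
 * (host byte order) from a raw attribute payload.
 * Returns UINT64_MAX for any other length (an assumption of this sketch).
 */
static uint64_t read_uint_payload(const void *data, size_t len)
{
	switch (len) {
	case sizeof(uint8_t): {
		uint8_t v;

		memcpy(&v, data, sizeof(v));
		return v;
	}
	case sizeof(uint16_t): {
		uint16_t v;

		memcpy(&v, data, sizeof(v));
		return v;
	}
	case sizeof(uint32_t): {
		uint32_t v;

		memcpy(&v, data, sizeof(v));
		return v;
	}
	case sizeof(uint64_t): {
		uint64_t v;

		memcpy(&v, data, sizeof(v));
		return v;
	}
	}
	return UINT64_MAX;
}
```

Using memcpy() rather than a pointer cast keeps the read valid even when the payload is not aligned for the wider types.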

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:03:06 +00:00
David Ahern
8b3b71898d Update kernel headers
Update kernel headers to commit:
    237bb5f7f7f5 ("cxgb4: unnecessary check for 0 in the free_sge_txq_uld() function")

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15 15:02:15 +00:00
Stephen Hemminger
11740815bf tc-simple.8: take Jamal's prompt off examples
The examples on tc-simple man page had extra stuff in
the prompt which is not necessary.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:07:33 -07:00
Stephen Hemminger
69d55c213d simple: support json output
Last action that never got JSON support.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:07:33 -07:00
Stephen Hemminger
af0ddbfa51 skbmod: support json in print
This tc action never got jsonized.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:07:33 -07:00
Stephen Hemminger
ba52b3d4dd pedit: log errors to stderr
The errors should go to stderr, not to stdout.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:07:33 -07:00
Stephen Hemminger
fc4226d247 tc: support JSON for legacy stats
The extended stats already supported JSON output, add to the
legacy stats as well.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:07:33 -07:00
Luca Boccassi
f31afe64d6 man: fix typo found by Lintian
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 10:02:26 -07:00
Stephen Hemminger
38656eeb35 tc: remove no longer used helpers
The removal of tick usage in netem means that some of the
helper functions in tc are no longer used and can be safely removed.
Other functions can be made static.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 09:56:29 -07:00
Stephen Hemminger
9a6b231ea1 netem: use 64 bit value for latency and jitter
The current version of netem in iproute2 has a maximum of 4.3 seconds
because of scaled 32 bit clock values. Some users would like to be
able to use larger delays to emulate things like storage delays.

Since kernel version 4.15, the netem qdisc has had netlink parameters
to express a wider range of delays in nanoseconds. But the iproute2
side was never updated to use them.

This does break compatibility with older kernels (4.14 and earlier).
With these out-of-support kernels, the latency/delay parameter
will end up being ignored.
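
The 4.3 second figure can be checked with a short calculation. The sketch below assumes the 32-bit value effectively counts nanoseconds, which is consistent with the figure quoted above but is not stated in this message. The signed 64-bit case is likewise an assumption made for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/* Largest delay, in seconds, that fits in a 32-bit nanosecond count. */
static double max_delay_seconds_u32(void)
{
	return (double)UINT32_MAX / 1e9;	/* about 4.29 seconds */
}

/* Largest delay, in seconds, that fits in a signed 64-bit nanosecond count. */
static double max_delay_seconds_s64(void)
{
	return (double)INT64_MAX / 1e9;		/* on the order of 9.2e9 seconds */
}
```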

Reported-by: Marc Blanchet <marc.blanchet@viagenie.ca>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 09:54:44 -07:00
Stephen Hemminger
56511223ef README: add note about kernel version compatibility
Since the next netem changes will break some uses with out-of-support kernels,
add an explicit policy about the range of supported kernel versions.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13 09:43:56 -07:00
Stephen Hemminger
9fb634deec tc: make exec_util arg const
The callbacks in exec_util should not be modifying underlying
qdisc operations structure.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12 15:11:43 -07:00
Stephen Hemminger
38b0e6c120 tc: make action_util arg const
The callbacks in action_util should not be modifying underlying
qdisc operations structure.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12 15:11:43 -07:00