

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a large cycle for RDMA, with several major patch series
  reworking parts of the core code.

   - Rework the so-called 'gid cache' and internal APIs to use a kref'd
     pointer to a struct instead of copying, push this upwards into the
     callers and add more stuff to the struct. The new design avoids
     some ugly races the old one suffered from. This is part of the
     namespace enablement work as the new struct is learning to be
     namespace aware.

   - Various uapi cleanups, moving more stuff to include/uapi and fixing
     some long standing bugs that have recently been discovered.

   - Driver updates for mlx5, mlx4, i40iw, rxe, cxgb4, hfi1, usnic,
     pvrdma, and hns

   - Provide max_send_sge and max_recv_sge attributes to better support
     HW where these values are asymmetric.

   - mlx5 user API 'devx' allows sending commands directly to the device
     FW, instead of trying to cram every wild and niche feature into the
     common API. Sort of like what GPUs do.

   - Major write() and ioctl() API rework to cleanly support PCI device
     hot unplug and advance the ioctl conversion work

   - Sparse and compile warning cleanups

   - Add 'const' to the ib_post_send() family of signatures, and permit a
     NULL 'bad_wr', which is the common use case

   - Various patches to avoid high order allocations across the stack

   - SRQ support for cxgb4, hns and qedr

   - Changes to IPoIB to better follow the netdev model for working with
     struct net_device lifetime"
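
The first item above (handing out a reference-counted pointer to a GID entry instead of copying it) can be illustrated with a small userspace sketch. Nothing here is the kernel API: `gid_attr`, `gid_attr_new()`, `gid_attr_get()` and `gid_attr_put()` are hypothetical names, and a plain counter stands in for `struct kref`, which would use atomic operations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted GID table entry. */
struct gid_attr {
	unsigned int refcount;	/* plain counter; not atomic like struct kref */
	unsigned char gid[16];
	unsigned int port_num;
};

/* Allocate an entry that starts with one reference owned by the caller. */
static struct gid_attr *gid_attr_new(const unsigned char gid[16],
				     unsigned int port_num)
{
	struct gid_attr *attr = calloc(1, sizeof(*attr));

	if (!attr)
		return NULL;
	attr->refcount = 1;
	memcpy(attr->gid, gid, sizeof(attr->gid));
	attr->port_num = port_num;
	return attr;
}

/* Share the entry by taking another reference instead of copying it. */
static struct gid_attr *gid_attr_get(struct gid_attr *attr)
{
	assert(attr->refcount > 0);
	attr->refcount++;
	return attr;
}

/* Drop one reference; returns 1 when this was the last one and the entry was freed. */
static int gid_attr_put(struct gid_attr *attr)
{
	assert(attr->refcount > 0);
	if (--attr->refcount == 0) {
		free(attr);
		return 1;
	}
	return 0;
}
```

With this shape every holder sees the same entry for as long as it holds a reference, so there is no window in which a private copy can go stale. That is the kind of race the message refers to, under the assumptions of this sketch.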

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (312 commits)
  Revert "net/smc: Replace ib_query_gid with rdma_get_gid_attr"
  RDMA/hns: Fix usage of bitmap allocation functions return values
  IB/core: Change filter function return type from int to bool
  IB/core: Update GID entries for netdevice whose mac address changes
  IB/core: Add default GIDs of the bond master netdev
  IB/core: Consider adding default GIDs of bond device
  IB/core: Delete lower netdevice default GID entries in bonding scenario
  IB/core: Avoid confusing del_netdev_default_ips
  IB/core: Add comment for change upper netevent handling
  qedr: Add user space support for SRQ
  qedr: Add support for kernel mode SRQ's
  qedr: Add wrapping generic structure for qpidr and adjust idr routines.
  IB/mlx5: Fix leaking stack memory to userspace
  Update the e-mail address of Bart Van Assche
  IB/ucm: Fix compiling ucm.c
  IB/uverbs: Do not check for device disassociation during ioctl
  IB/uverbs: Remove struct uverbs_root_spec and all supporting code
  IB/uverbs: Use uverbs_api to unmarshal ioctl commands
  IB/uverbs: Use uverbs_alloc for allocations
  IB/uverbs: Add a simple allocator to uverbs_attr_bundle
  ...
Linus Torvalds 2018-08-17 12:44:48 -07:00
commit 9bd553929f
233 changed files with 12440 additions and 7119 deletions

View File

@@ -31,6 +31,8 @@ Arnaud Patard <arnaud.patard@rtp-net.org>
 Arnd Bergmann <arnd@arndb.de>
 Axel Dyks <xl@xlsigned.net>
 Axel Lin <axel.lin@gmail.com>
+Bart Van Assche <bvanassche@acm.org> <bart.vanassche@wdc.com>
+Bart Van Assche <bvanassche@acm.org> <bart.vanassche@sandisk.com>
 Ben Gardner <bgardner@wabtec.com>
 Ben M Cahill <ben.m.cahill@intel.com>
 Björn Steinbrink <B.Steinbrink@gmx.de>

View File

@@ -3536,7 +3536,6 @@ F: drivers/net/ethernet/cisco/enic/
 
 CISCO VIC LOW LATENCY NIC DRIVER
 M:	Christian Benvenuti <benve@cisco.com>
-M:	Dave Goodell <dgoodell@cisco.com>
 S:	Supported
 F:	drivers/infiniband/hw/usnic/
 
@@ -7623,9 +7622,8 @@ S: Maintained
 F:	drivers/firmware/iscsi_ibft*
 
 ISCSI EXTENSIONS FOR RDMA (ISER) INITIATOR
-M:	Or Gerlitz <ogerlitz@mellanox.com>
 M:	Sagi Grimberg <sagi@grimberg.me>
-M:	Roi Dayan <roid@mellanox.com>
+M:	Max Gurtovoy <maxg@mellanox.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 W:	http://www.openfabrics.org
@ -12754,15 +12752,21 @@ S: Maintained
F: drivers/scsi/sr*
SCSI RDMA PROTOCOL (SRP) INITIATOR
M: Bart Van Assche <bart.vanassche@sandisk.com>
M: Bart Van Assche <bvanassche@acm.org>
L: linux-rdma@vger.kernel.org
S: Supported
W: http://www.openfabrics.org
Q: http://patchwork.kernel.org/project/linux-rdma/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/dad/srp-initiator.git
F: drivers/infiniband/ulp/srp/
F: include/scsi/srp.h
SCSI RDMA PROTOCOL (SRP) TARGET
M: Bart Van Assche <bvanassche@acm.org>
L: linux-rdma@vger.kernel.org
L: target-devel@vger.kernel.org
S: Supported
Q: http://patchwork.kernel.org/project/linux-rdma/list/
F: drivers/infiniband/ulp/srpt/
SCSI SG DRIVER
M: Doug Gilbert <dgilbert@interlog.com>
L: linux-scsi@vger.kernel.org

View File

@@ -37,7 +37,7 @@ config INFINIBAND_USER_ACCESS
 
 config INFINIBAND_USER_ACCESS_UCM
 	bool "Userspace CM (UCM, DEPRECATED)"
-	depends on BROKEN
+	depends on BROKEN || COMPILE_TEST
 	depends on INFINIBAND_USER_ACCESS
 	help
 	  The UCM module has known security flaws, which no one is

View File

@@ -35,6 +35,7 @@ ib_ucm-y := ucm.o
 
 ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
 				rdma_core.o uverbs_std_types.o uverbs_ioctl.o \
-				uverbs_ioctl_merge.o uverbs_std_types_cq.o \
+				uverbs_std_types_cq.o \
 				uverbs_std_types_flow_action.o uverbs_std_types_dm.o \
-				uverbs_std_types_mr.o uverbs_std_types_counters.o
+				uverbs_std_types_mr.o uverbs_std_types_counters.o \
+				uverbs_uapi.o

View File

@@ -188,7 +188,7 @@ static int ib_nl_ip_send_msg(struct rdma_dev_addr *dev_addr,
 	return -ENODATA;
 }
 
-int rdma_addr_size(struct sockaddr *addr)
+int rdma_addr_size(const struct sockaddr *addr)
 {
 	switch (addr->sa_family) {
 	case AF_INET:
@@ -315,18 +315,16 @@ static int dst_fetch_ha(const struct dst_entry *dst,
 	int ret = 0;
 
 	n = dst_neigh_lookup(dst, daddr);
+	if (!n)
+		return -ENODATA;
 
-	rcu_read_lock();
-	if (!n || !(n->nud_state & NUD_VALID)) {
-		if (n)
-			neigh_event_send(n, NULL);
+	if (!(n->nud_state & NUD_VALID)) {
+		neigh_event_send(n, NULL);
 		ret = -ENODATA;
 	} else {
 		rdma_copy_addr(dev_addr, dst->dev, n->ha);
 	}
-	rcu_read_unlock();
 
-	if (n)
-		neigh_release(n);
+	neigh_release(n);
 
 	return ret;
@@ -587,7 +585,7 @@ static void process_one_req(struct work_struct *_work)
 	spin_unlock_bh(&lock);
 }
 
-int rdma_resolve_ip(struct sockaddr *src_addr, struct sockaddr *dst_addr,
+int rdma_resolve_ip(struct sockaddr *src_addr, const struct sockaddr *dst_addr,
 		    struct rdma_dev_addr *addr, int timeout_ms,
 		    void (*callback)(int status, struct sockaddr *src_addr,
 				     struct rdma_dev_addr *addr, void *context),

File diff suppressed because it is too large

View File

@@ -474,7 +474,7 @@ static int cm_init_av_for_lap(struct cm_port *port, struct ib_wc *wc,
 	if (ret)
 		return ret;
 
-	memcpy(&av->ah_attr, &new_ah_attr, sizeof(new_ah_attr));
+	rdma_move_ah_attr(&av->ah_attr, &new_ah_attr);
 	return 0;
 }
 
@ -508,31 +508,50 @@ static int add_cm_id_to_port_list(struct cm_id_private *cm_id_priv,
return ret;
}
static struct cm_port *get_cm_port_from_path(struct sa_path_rec *path)
static struct cm_port *
get_cm_port_from_path(struct sa_path_rec *path, const struct ib_gid_attr *attr)
{
struct cm_device *cm_dev;
struct cm_port *port = NULL;
unsigned long flags;
u8 p;
struct net_device *ndev = ib_get_ndev_from_path(path);
if (attr) {
read_lock_irqsave(&cm.device_lock, flags);
list_for_each_entry(cm_dev, &cm.device_list, list) {
if (!ib_find_cached_gid(cm_dev->ib_device, &path->sgid,
sa_conv_pathrec_to_gid_type(path),
ndev, &p, NULL)) {
port = cm_dev->port[p - 1];
if (cm_dev->ib_device == attr->device) {
port = cm_dev->port[attr->port_num - 1];
break;
}
}
read_unlock_irqrestore(&cm.device_lock, flags);
if (ndev)
dev_put(ndev);
} else {
/* SGID attribute can be NULL in following
* conditions.
* (a) Alternative path
* (b) IB link layer without GRH
* (c) LAP send messages
*/
read_lock_irqsave(&cm.device_lock, flags);
list_for_each_entry(cm_dev, &cm.device_list, list) {
attr = rdma_find_gid(cm_dev->ib_device,
&path->sgid,
sa_conv_pathrec_to_gid_type(path),
NULL);
if (!IS_ERR(attr)) {
port = cm_dev->port[attr->port_num - 1];
break;
}
}
read_unlock_irqrestore(&cm.device_lock, flags);
if (port)
rdma_put_gid_attr(attr);
}
return port;
}
static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av,
static int cm_init_av_by_path(struct sa_path_rec *path,
const struct ib_gid_attr *sgid_attr,
struct cm_av *av,
struct cm_id_private *cm_id_priv)
{
struct rdma_ah_attr new_ah_attr;
@@ -540,7 +559,7 @@ static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av,
 	struct cm_port *port;
 	int ret;
 
-	port = get_cm_port_from_path(path);
+	port = get_cm_port_from_path(path, sgid_attr);
 	if (!port)
 		return -EINVAL;
 	cm_dev = port->cm_dev;
@@ -554,22 +573,26 @@ static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av,
 
 	/*
 	 * av->ah_attr might be initialized based on wc or during
-	 * request processing time. So initialize a new ah_attr on stack.
+	 * request processing time which might have reference to sgid_attr.
+	 * So initialize a new ah_attr on stack.
 	 * If initialization fails, old ah_attr is used for sending any
 	 * responses. If initialization is successful, than new ah_attr
-	 * is used by overwriting the old one.
+	 * is used by overwriting the old one. So that right ah_attr
+	 * can be used to return an error response.
 	 */
 	ret = ib_init_ah_attr_from_path(cm_dev->ib_device, port->port_num, path,
-					&new_ah_attr);
+					&new_ah_attr, sgid_attr);
 	if (ret)
 		return ret;
 
 	av->timeout = path->packet_life_time + 1;
 
 	ret = add_cm_id_to_port_list(cm_id_priv, av, port);
-	if (ret)
+	if (ret) {
+		rdma_destroy_ah_attr(&new_ah_attr);
 		return ret;
-	memcpy(&av->ah_attr, &new_ah_attr, sizeof(new_ah_attr));
+	}
+	rdma_move_ah_attr(&av->ah_attr, &new_ah_attr);
 	return 0;
 }
 
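
The hunk above swaps a raw `memcpy()` for a move helper and destroys the temporary on the error path. A minimal userspace sketch of why that matters once the attribute owns a reference follows. All names (`ref_obj`, `ah`, `ah_destroy()`, `ah_move()`) are hypothetical, and the behaviour is an assumption inferred from how `rdma_move_ah_attr()` and `rdma_destroy_ah_attr()` are used in this diff, not a copy of the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object that an address handle may point at. */
struct ref_obj {
	int refs;
};

/* Hypothetical stand-in for an ah_attr that owns at most one reference. */
struct ah {
	struct ref_obj *sgid;	/* NULL when no reference is held */
	int sl;
};

/* Release the reference the handle owns, if any, and clear the pointer. */
static void ah_destroy(struct ah *ah)
{
	if (ah->sgid) {
		assert(ah->sgid->refs > 0);
		ah->sgid->refs--;
		ah->sgid = NULL;
	}
}

/*
 * Move src into dst: drop the reference dst held, take over the one
 * src held, and leave src empty so destroying it later is a no-op.
 */
static void ah_move(struct ah *dst, struct ah *src)
{
	ah_destroy(dst);
	*dst = *src;
	src->sgid = NULL;
}
```

A plain `memcpy()` in place of `ah_move()` would leak the reference previously held by the destination and leave two handles claiming the same reference, which is the situation the move helper is assumed to avoid.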
@ -1091,6 +1114,9 @@ retest:
wait_for_completion(&cm_id_priv->comp);
while ((work = cm_dequeue_work(cm_id_priv)) != NULL)
cm_free_work(work);
rdma_destroy_ah_attr(&cm_id_priv->av.ah_attr);
rdma_destroy_ah_attr(&cm_id_priv->alt_av.ah_attr);
kfree(cm_id_priv->private_data);
kfree(cm_id_priv);
}
@@ -1230,14 +1256,12 @@ new_id:
 }
 EXPORT_SYMBOL(ib_cm_insert_listen);
 
-static __be64 cm_form_tid(struct cm_id_private *cm_id_priv,
-			  enum cm_msg_sequence msg_seq)
+static __be64 cm_form_tid(struct cm_id_private *cm_id_priv)
 {
 	u64 hi_tid, low_tid;
 
 	hi_tid   = ((u64) cm_id_priv->av.port->mad_agent->hi_tid) << 32;
-	low_tid  = (u64) ((__force u32)cm_id_priv->id.local_id |
-			  (msg_seq << 30));
+	low_tid  = (u64)cm_id_priv->id.local_id;
 	return cpu_to_be64(hi_tid | low_tid);
 }
 
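
After this change the transaction ID is just the agent's `hi_tid` in the upper 32 bits and the local ID in the lower 32 bits, with no message-sequence bits mixed in. A userspace sketch of that composition, assuming hypothetical helper names `form_tid()` and `tid_to_be64()` and plain `uint32_t` inputs:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compose a 64-bit transaction ID: agent-specific bits in the upper
 * 32 bits, the local communication ID in the lower 32 bits.
 */
static uint64_t form_tid(uint32_t hi_tid, uint32_t local_id)
{
	return ((uint64_t)hi_tid << 32) | (uint64_t)local_id;
}

/* Serialize to big-endian byte order, the job cpu_to_be64() does in the kernel. */
static void tid_to_be64(uint64_t tid, uint8_t out[8])
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(tid >> (56 - 8 * i));
}
```

For example, `form_tid(0x12, 0xdeadbeef)` yields `0x00000012deadbeef`, and `tid_to_be64()` writes it with the most significant byte first.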
@ -1265,7 +1289,7 @@ static void cm_format_req(struct cm_req_msg *req_msg,
pri_path->opa.slid);
cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_REQ));
cm_form_tid(cm_id_priv));
req_msg->local_comm_id = cm_id_priv->id.local_id;
req_msg->service_id = param->service_id;
@ -1413,12 +1437,13 @@ int ib_send_cm_req(struct ib_cm_id *cm_id,
goto out;
}
ret = cm_init_av_by_path(param->primary_path, &cm_id_priv->av,
ret = cm_init_av_by_path(param->primary_path,
param->ppath_sgid_attr, &cm_id_priv->av,
cm_id_priv);
if (ret)
goto error1;
if (param->alternate_path) {
ret = cm_init_av_by_path(param->alternate_path,
ret = cm_init_av_by_path(param->alternate_path, NULL,
&cm_id_priv->alt_av, cm_id_priv);
if (ret)
goto error1;
@ -1646,7 +1671,7 @@ static void cm_opa_to_ib_sgid(struct cm_work *work,
(ib_is_opa_gid(&path->sgid))) {
union ib_gid sgid;
if (ib_get_cached_gid(dev, port_num, 0, &sgid, NULL)) {
if (rdma_query_gid(dev, port_num, 0, &sgid)) {
dev_warn(&dev->dev,
"Error updating sgid in CM request\n");
return;
@ -1691,6 +1716,7 @@ static void cm_format_req_event(struct cm_work *work,
param->retry_count = cm_req_get_retry_count(req_msg);
param->rnr_retry_count = cm_req_get_rnr_retry_count(req_msg);
param->srq = cm_req_get_srq(req_msg);
param->ppath_sgid_attr = cm_id_priv->av.ah_attr.grh.sgid_attr;
work->cm_event.private_data = &req_msg->private_data;
}
@ -1914,9 +1940,8 @@ static int cm_req_handler(struct cm_work *work)
struct ib_cm_id *cm_id;
struct cm_id_private *cm_id_priv, *listen_cm_id_priv;
struct cm_req_msg *req_msg;
union ib_gid gid;
struct ib_gid_attr gid_attr;
const struct ib_global_route *grh;
const struct ib_gid_attr *gid_attr;
int ret;
req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
@ -1961,24 +1986,13 @@ static int cm_req_handler(struct cm_work *work)
if (cm_req_has_alt_path(req_msg))
memset(&work->path[1], 0, sizeof(work->path[1]));
grh = rdma_ah_read_grh(&cm_id_priv->av.ah_attr);
ret = ib_get_cached_gid(work->port->cm_dev->ib_device,
work->port->port_num,
grh->sgid_index,
&gid, &gid_attr);
if (ret) {
ib_send_cm_rej(cm_id, IB_CM_REJ_UNSUPPORTED, NULL, 0, NULL, 0);
goto rejected;
}
gid_attr = grh->sgid_attr;
if (gid_attr.ndev) {
if (gid_attr && gid_attr->ndev) {
work->path[0].rec_type =
sa_conv_gid_to_pathrec_type(gid_attr.gid_type);
sa_path_set_ifindex(&work->path[0],
gid_attr.ndev->ifindex);
sa_path_set_ndev(&work->path[0],
dev_net(gid_attr.ndev));
dev_put(gid_attr.ndev);
sa_conv_gid_to_pathrec_type(gid_attr->gid_type);
} else {
/* If no GID attribute or ndev is null, it is not RoCE. */
cm_path_set_rec_type(work->port->cm_dev->ib_device,
work->port->port_num,
&work->path[0],
@ -1992,15 +2006,14 @@ static int cm_req_handler(struct cm_work *work)
sa_path_set_dmac(&work->path[0],
cm_id_priv->av.ah_attr.roce.dmac);
work->path[0].hop_limit = grh->hop_limit;
ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av,
ret = cm_init_av_by_path(&work->path[0], gid_attr, &cm_id_priv->av,
cm_id_priv);
if (ret) {
int err;
err = ib_get_cached_gid(work->port->cm_dev->ib_device,
err = rdma_query_gid(work->port->cm_dev->ib_device,
work->port->port_num, 0,
&work->path[0].sgid,
NULL);
&work->path[0].sgid);
if (err)
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
NULL, 0, NULL, 0);
@ -2012,8 +2025,8 @@ static int cm_req_handler(struct cm_work *work)
goto rejected;
}
if (cm_req_has_alt_path(req_msg)) {
ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av,
cm_id_priv);
ret = cm_init_av_by_path(&work->path[1], NULL,
&cm_id_priv->alt_av, cm_id_priv);
if (ret) {
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID,
&work->path[0].sgid,
@ -2451,7 +2464,7 @@ static void cm_format_dreq(struct cm_dreq_msg *dreq_msg,
u8 private_data_len)
{
cm_format_mad_hdr(&dreq_msg->hdr, CM_DREQ_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_DREQ));
cm_form_tid(cm_id_priv));
dreq_msg->local_comm_id = cm_id_priv->id.local_id;
dreq_msg->remote_comm_id = cm_id_priv->id.remote_id;
cm_dreq_set_remote_qpn(dreq_msg, cm_id_priv->remote_qpn);
@ -3082,7 +3095,7 @@ static void cm_format_lap(struct cm_lap_msg *lap_msg,
alt_ext = opa_is_extended_lid(alternate_path->opa.dlid,
alternate_path->opa.slid);
cm_format_mad_hdr(&lap_msg->hdr, CM_LAP_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_LAP));
cm_form_tid(cm_id_priv));
lap_msg->local_comm_id = cm_id_priv->id.local_id;
lap_msg->remote_comm_id = cm_id_priv->id.remote_id;
cm_lap_set_remote_qpn(lap_msg, cm_id_priv->remote_qpn);
@ -3136,7 +3149,7 @@ int ib_send_cm_lap(struct ib_cm_id *cm_id,
goto out;
}
ret = cm_init_av_by_path(alternate_path, &cm_id_priv->alt_av,
ret = cm_init_av_by_path(alternate_path, NULL, &cm_id_priv->alt_av,
cm_id_priv);
if (ret)
goto out;
@ -3279,7 +3292,7 @@ static int cm_lap_handler(struct cm_work *work)
if (ret)
goto unlock;
cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av,
cm_init_av_by_path(param->alternate_path, NULL, &cm_id_priv->alt_av,
cm_id_priv);
cm_id_priv->id.lap_state = IB_CM_LAP_RCVD;
cm_id_priv->tid = lap_msg->hdr.tid;
@ -3458,7 +3471,7 @@ static void cm_format_sidr_req(struct cm_sidr_req_msg *sidr_req_msg,
struct ib_cm_sidr_req_param *param)
{
cm_format_mad_hdr(&sidr_req_msg->hdr, CM_SIDR_REQ_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_SIDR));
cm_form_tid(cm_id_priv));
sidr_req_msg->request_id = cm_id_priv->id.local_id;
sidr_req_msg->pkey = param->path->pkey;
sidr_req_msg->service_id = param->service_id;
@ -3481,7 +3494,9 @@ int ib_send_cm_sidr_req(struct ib_cm_id *cm_id,
return -EINVAL;
cm_id_priv = container_of(cm_id, struct cm_id_private, id);
ret = cm_init_av_by_path(param->path, &cm_id_priv->av, cm_id_priv);
ret = cm_init_av_by_path(param->path, param->sgid_attr,
&cm_id_priv->av,
cm_id_priv);
if (ret)
goto out;
@ -3518,6 +3533,7 @@ out:
EXPORT_SYMBOL(ib_send_cm_sidr_req);
static void cm_format_sidr_req_event(struct cm_work *work,
const struct cm_id_private *rx_cm_id,
struct ib_cm_id *listen_id)
{
struct cm_sidr_req_msg *sidr_req_msg;
@ -3531,6 +3547,7 @@ static void cm_format_sidr_req_event(struct cm_work *work,
param->service_id = sidr_req_msg->service_id;
param->bth_pkey = cm_get_bth_pkey(work);
param->port = work->port->port_num;
param->sgid_attr = rx_cm_id->av.ah_attr.grh.sgid_attr;
work->cm_event.private_data = &sidr_req_msg->private_data;
}
@ -3588,7 +3605,7 @@ static int cm_sidr_req_handler(struct cm_work *work)
cm_id_priv->id.service_id = sidr_req_msg->service_id;
cm_id_priv->id.service_mask = ~cpu_to_be64(0);
cm_format_sidr_req_event(work, &cur_cm_id_priv->id);
cm_format_sidr_req_event(work, cm_id_priv, &cur_cm_id_priv->id);
cm_process_work(cm_id_priv, work);
cm_deref_id(cur_cm_id_priv);
return 0;
@ -3665,7 +3682,8 @@ error: spin_unlock_irqrestore(&cm_id_priv->lock, flags);
}
EXPORT_SYMBOL(ib_send_cm_sidr_rep);
static void cm_format_sidr_rep_event(struct cm_work *work)
static void cm_format_sidr_rep_event(struct cm_work *work,
const struct cm_id_private *cm_id_priv)
{
struct cm_sidr_rep_msg *sidr_rep_msg;
struct ib_cm_sidr_rep_event_param *param;
@ -3678,6 +3696,7 @@ static void cm_format_sidr_rep_event(struct cm_work *work)
param->qpn = be32_to_cpu(cm_sidr_rep_get_qpn(sidr_rep_msg));
param->info = &sidr_rep_msg->info;
param->info_len = sidr_rep_msg->info_length;
param->sgid_attr = cm_id_priv->av.ah_attr.grh.sgid_attr;
work->cm_event.private_data = &sidr_rep_msg->private_data;
}
@ -3701,7 +3720,7 @@ static int cm_sidr_rep_handler(struct cm_work *work)
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
spin_unlock_irq(&cm_id_priv->lock);
cm_format_sidr_rep_event(work);
cm_format_sidr_rep_event(work, cm_id_priv);
cm_process_work(cm_id_priv, work);
return 0;
out:

View File

@@ -44,13 +44,6 @@
 
 #define IB_CM_CLASS_VERSION	2 /* IB specification 1.2 */
 
-enum cm_msg_sequence {
-	CM_MSG_SEQUENCE_REQ,
-	CM_MSG_SEQUENCE_LAP,
-	CM_MSG_SEQUENCE_DREQ,
-	CM_MSG_SEQUENCE_SIDR
-};
-
 struct cm_req_msg {
 	struct ib_mad_hdr hdr;
 

View File

@ -366,7 +366,6 @@ struct cma_multicast {
void *context;
struct sockaddr_storage addr;
struct kref mcref;
bool igmp_joined;
u8 join_state;
};
@ -412,11 +411,11 @@ struct cma_req_info {
struct sockaddr_storage listen_addr_storage;
struct sockaddr_storage src_addr_storage;
struct ib_device *device;
int port;
union ib_gid local_gid;
__be64 service_id;
int port;
bool has_gid;
u16 pkey;
bool has_gid:1;
};
static int cma_comp(struct rdma_id_private *id_priv, enum rdma_cm_state comp)
@ -491,12 +490,10 @@ static void _cma_attach_to_dev(struct rdma_id_private *id_priv,
{
cma_ref_dev(cma_dev);
id_priv->cma_dev = cma_dev;
id_priv->gid_type = 0;
id_priv->id.device = cma_dev->device;
id_priv->id.route.addr.dev_addr.transport =
rdma_node_get_transport(cma_dev->device->node_type);
list_add_tail(&id_priv->list, &cma_dev->id_list);
id_priv->res.type = RDMA_RESTRACK_CM_ID;
rdma_restrack_add(&id_priv->res);
}
@ -603,46 +600,53 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
return ret;
}
static inline int cma_validate_port(struct ib_device *device, u8 port,
static const struct ib_gid_attr *
cma_validate_port(struct ib_device *device, u8 port,
enum ib_gid_type gid_type,
union ib_gid *gid,
struct rdma_id_private *id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
int bound_if_index = dev_addr->bound_dev_if;
const struct ib_gid_attr *sgid_attr;
int dev_type = dev_addr->dev_type;
struct net_device *ndev = NULL;
int ret = -ENODEV;
if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
return ret;
return ERR_PTR(-ENODEV);
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
return ret;
return ERR_PTR(-ENODEV);
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(dev_addr->net, bound_if_index);
if (!ndev)
return ret;
return ERR_PTR(-ENODEV);
} else {
gid_type = IB_GID_TYPE_IB;
}
ret = ib_find_cached_gid_by_port(device, gid, gid_type, port,
ndev, NULL);
sgid_attr = rdma_find_gid_by_port(device, gid, gid_type, port, ndev);
if (ndev)
dev_put(ndev);
return sgid_attr;
}
return ret;
static void cma_bind_sgid_attr(struct rdma_id_private *id_priv,
const struct ib_gid_attr *sgid_attr)
{
WARN_ON(id_priv->id.route.addr.dev_addr.sgid_attr);
id_priv->id.route.addr.dev_addr.sgid_attr = sgid_attr;
}
static int cma_acquire_dev(struct rdma_id_private *id_priv,
struct rdma_id_private *listen_id_priv)
const struct rdma_id_private *listen_id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
const struct ib_gid_attr *sgid_attr;
struct cma_device *cma_dev;
union ib_gid gid, iboe_gid, *gidp;
enum ib_gid_type gid_type;
int ret = -ENODEV;
u8 port;
@ -662,14 +666,13 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
port = listen_id_priv->id.port_num;
gidp = rdma_protocol_roce(cma_dev->device, port) ?
&iboe_gid : &gid;
ret = cma_validate_port(cma_dev->device, port,
rdma_protocol_ib(cma_dev->device, port) ?
IB_GID_TYPE_IB :
listen_id_priv->gid_type, gidp,
id_priv);
if (!ret) {
gid_type = listen_id_priv->gid_type;
sgid_attr = cma_validate_port(cma_dev->device, port,
gid_type, gidp, id_priv);
if (!IS_ERR(sgid_attr)) {
id_priv->id.port_num = port;
cma_bind_sgid_attr(id_priv, sgid_attr);
ret = 0;
goto out;
}
}
@ -683,14 +686,13 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
gidp = rdma_protocol_roce(cma_dev->device, port) ?
&iboe_gid : &gid;
ret = cma_validate_port(cma_dev->device, port,
rdma_protocol_ib(cma_dev->device, port) ?
IB_GID_TYPE_IB :
cma_dev->default_gid_type[port - 1],
gidp, id_priv);
if (!ret) {
gid_type = cma_dev->default_gid_type[port - 1];
sgid_attr = cma_validate_port(cma_dev->device, port,
gid_type, gidp, id_priv);
if (!IS_ERR(sgid_attr)) {
id_priv->id.port_num = port;
cma_bind_sgid_attr(id_priv, sgid_attr);
ret = 0;
goto out;
}
}
@ -732,8 +734,8 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
if (ib_get_cached_port_state(cur_dev->device, p, &port_state))
continue;
for (i = 0; !ib_get_cached_gid(cur_dev->device, p, i,
&gid, NULL);
for (i = 0; !rdma_query_gid(cur_dev->device,
p, i, &gid);
i++) {
if (!memcmp(&gid, dgid, sizeof(gid))) {
cma_dev = cur_dev;
@ -785,12 +787,14 @@ struct rdma_cm_id *__rdma_create_id(struct net *net,
id_priv->res.kern_name = caller;
else
rdma_restrack_set_task(&id_priv->res, current);
id_priv->res.type = RDMA_RESTRACK_CM_ID;
id_priv->state = RDMA_CM_IDLE;
id_priv->id.context = context;
id_priv->id.event_handler = event_handler;
id_priv->id.ps = ps;
id_priv->id.qp_type = qp_type;
id_priv->tos_set = false;
id_priv->gid_type = IB_GID_TYPE_IB;
spin_lock_init(&id_priv->lock);
mutex_init(&id_priv->qp_mutex);
init_completion(&id_priv->comp);
@@ -1036,35 +1040,38 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
 }
 EXPORT_SYMBOL(rdma_init_qp_attr);
 
-static inline int cma_zero_addr(struct sockaddr *addr)
+static inline bool cma_zero_addr(const struct sockaddr *addr)
 {
 	switch (addr->sa_family) {
 	case AF_INET:
 		return ipv4_is_zeronet(((struct sockaddr_in *)addr)->sin_addr.s_addr);
 	case AF_INET6:
-		return ipv6_addr_any(&((struct sockaddr_in6 *) addr)->sin6_addr);
+		return ipv6_addr_any(&((struct sockaddr_in6 *)addr)->sin6_addr);
 	case AF_IB:
-		return ib_addr_any(&((struct sockaddr_ib *) addr)->sib_addr);
+		return ib_addr_any(&((struct sockaddr_ib *)addr)->sib_addr);
 	default:
-		return 0;
+		return false;
 	}
 }
 
-static inline int cma_loopback_addr(struct sockaddr *addr)
+static inline bool cma_loopback_addr(const struct sockaddr *addr)
 {
 	switch (addr->sa_family) {
 	case AF_INET:
-		return ipv4_is_loopback(((struct sockaddr_in *) addr)->sin_addr.s_addr);
+		return ipv4_is_loopback(
+			((struct sockaddr_in *)addr)->sin_addr.s_addr);
 	case AF_INET6:
-		return ipv6_addr_loopback(&((struct sockaddr_in6 *) addr)->sin6_addr);
+		return ipv6_addr_loopback(
+			&((struct sockaddr_in6 *)addr)->sin6_addr);
 	case AF_IB:
-		return ib_addr_loopback(&((struct sockaddr_ib *) addr)->sib_addr);
+		return ib_addr_loopback(
+			&((struct sockaddr_ib *)addr)->sib_addr);
 	default:
-		return 0;
+		return false;
 	}
 }
 
-static inline int cma_any_addr(struct sockaddr *addr)
+static inline bool cma_any_addr(const struct sockaddr *addr)
 {
 	return cma_zero_addr(addr) || cma_loopback_addr(addr);
 }
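
The three helpers above classify an address as zero, loopback, or either, and now take a `const` pointer and return `bool`. A POSIX userspace approximation is sketched below. The function names are hypothetical, the `AF_IB` case is omitted because it has no userspace socket type here, and the IPv4 tests assume the kernel helpers check the 0.0.0.0/8 and 127.0.0.0/8 ranges.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* True for 0.0.0.0/8 (IPv4) or :: (IPv6); false for other families. */
static bool addr_is_zero(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *in = (const struct sockaddr_in *)addr;

		return (ntohl(in->sin_addr.s_addr) >> 24) == 0;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)addr;

		return IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr);
	}
	default:
		return false;
	}
}

/* True for 127.0.0.0/8 (IPv4) or ::1 (IPv6); false for other families. */
static bool addr_is_loopback(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *in = (const struct sockaddr_in *)addr;

		return (ntohl(in->sin_addr.s_addr) >> 24) == 127;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)addr;

		return IN6_IS_ADDR_LOOPBACK(&in6->sin6_addr);
	}
	default:
		return false;
	}
}

/* "Any" address in the sense used above: zero or loopback. */
static bool addr_is_any(const struct sockaddr *addr)
{
	assert(addr != NULL);
	return addr_is_zero(addr) || addr_is_loopback(addr);
}
```

Taking `const struct sockaddr *` lets callers that only hold read-only addresses use these predicates without a cast, which matches the direction of the signature changes in this hunk.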
@ -1087,7 +1094,7 @@ static int cma_addr_cmp(struct sockaddr *src, struct sockaddr *dst)
}
}
static __be16 cma_port(struct sockaddr *addr)
static __be16 cma_port(const struct sockaddr *addr)
{
struct sockaddr_ib *sib;
@ -1105,15 +1112,15 @@ static __be16 cma_port(struct sockaddr *addr)
}
}
static inline int cma_any_port(struct sockaddr *addr)
static inline int cma_any_port(const struct sockaddr *addr)
{
return !cma_port(addr);
}
static void cma_save_ib_info(struct sockaddr *src_addr,
struct sockaddr *dst_addr,
struct rdma_cm_id *listen_id,
struct sa_path_rec *path)
const struct rdma_cm_id *listen_id,
const struct sa_path_rec *path)
{
struct sockaddr_ib *listen_ib, *ib;
@ -1198,7 +1205,7 @@ static u16 cma_port_from_service_id(__be64 service_id)
static int cma_save_ip_info(struct sockaddr *src_addr,
struct sockaddr *dst_addr,
struct ib_cm_event *ib_event,
const struct ib_cm_event *ib_event,
__be64 service_id)
{
struct cma_hdr *hdr;
@ -1228,8 +1235,8 @@ static int cma_save_ip_info(struct sockaddr *src_addr,
static int cma_save_net_info(struct sockaddr *src_addr,
struct sockaddr *dst_addr,
struct rdma_cm_id *listen_id,
struct ib_cm_event *ib_event,
const struct rdma_cm_id *listen_id,
const struct ib_cm_event *ib_event,
sa_family_t sa_family, __be64 service_id)
{
if (sa_family == AF_IB) {
@ -1361,7 +1368,23 @@ static bool validate_net_dev(struct net_device *net_dev,
}
}
static struct net_device *cma_get_net_dev(struct ib_cm_event *ib_event,
static struct net_device *
roce_get_net_dev_by_cm_event(const struct ib_cm_event *ib_event)
{
const struct ib_gid_attr *sgid_attr = NULL;
if (ib_event->event == IB_CM_REQ_RECEIVED)
sgid_attr = ib_event->param.req_rcvd.ppath_sgid_attr;
else if (ib_event->event == IB_CM_SIDR_REQ_RECEIVED)
sgid_attr = ib_event->param.sidr_req_rcvd.sgid_attr;
if (!sgid_attr)
return NULL;
dev_hold(sgid_attr->ndev);
return sgid_attr->ndev;
}
static struct net_device *cma_get_net_dev(const struct ib_cm_event *ib_event,
struct cma_req_info *req)
{
struct sockaddr *listen_addr =
@ -1376,7 +1399,11 @@ static struct net_device *cma_get_net_dev(struct ib_cm_event *ib_event,
if (err)
return ERR_PTR(err);
net_dev = ib_get_net_dev_by_params(req->device, req->port, req->pkey,
if (rdma_protocol_roce(req->device, req->port))
net_dev = roce_get_net_dev_by_cm_event(ib_event);
else
net_dev = ib_get_net_dev_by_params(req->device, req->port,
req->pkey,
gid, listen_addr);
if (!net_dev)
return ERR_PTR(-ENODEV);
@@ -1440,14 +1467,20 @@ static bool cma_match_net_dev(const struct rdma_cm_id *id,
 	const struct rdma_addr *addr = &id->route.addr;
 
 	if (!net_dev)
-		/* This request is an AF_IB request or a RoCE request */
+		/* This request is an AF_IB request */
 		return (!id->port_num || id->port_num == port_num) &&
-		       (addr->src_addr.ss_family == AF_IB ||
-			rdma_protocol_roce(id->device, port_num));
+		       (addr->src_addr.ss_family == AF_IB);
 
-	return !addr->dev_addr.bound_dev_if ||
-	       (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
-		addr->dev_addr.bound_dev_if == net_dev->ifindex);
+	/*
+	 * Net namespaces must match, and if the listner is listening
+	 * on a specific netdevice than netdevice must match as well.
+	 */
+	if (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
+	    (!!addr->dev_addr.bound_dev_if ==
+	     (addr->dev_addr.bound_dev_if == net_dev->ifindex)))
+		return true;
+	else
+		return false;
 }
 
 static struct rdma_id_private *cma_find_listener(
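
The rewritten condition in `cma_match_net_dev()` is compact, so a reduced model may help when reading it. The sketch below expresses the old and new rules over three plain inputs. The function names and the reduction to `same_ns`, `bound_if` and `ifindex` are assumptions made for illustration, and interface indexes are assumed to be positive.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Inputs of the reduced model:
 *  same_ns  - listener and incoming netdev are in the same net namespace
 *  bound_if - interface index the listener is bound to, 0 if unbound
 *  ifindex  - index of the netdev the request arrived on (assumed > 0)
 */

/* Old rule: an unbound listener matched regardless of namespace. */
static bool old_listener_matches(bool same_ns, int bound_if, int ifindex)
{
	assert(ifindex > 0);
	return bound_if == 0 || (same_ns && bound_if == ifindex);
}

/*
 * New rule: the namespace must always match; a bound listener must
 * additionally be bound to the arriving interface.
 */
static bool new_listener_matches(bool same_ns, int bound_if, int ifindex)
{
	assert(ifindex > 0);
	return same_ns && ((bound_if != 0) == (bound_if == ifindex));
}
```

The two rules differ only for an unbound listener in a different namespace: the old form accepts it and the new form rejects it, which is consistent with the namespace work described in the merge message.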
@ -1480,8 +1513,9 @@ static struct rdma_id_private *cma_find_listener(
return ERR_PTR(-EINVAL);
}
static struct rdma_id_private *cma_id_from_event(struct ib_cm_id *cm_id,
struct ib_cm_event *ib_event,
static struct rdma_id_private *
cma_ib_id_from_event(struct ib_cm_id *cm_id,
const struct ib_cm_event *ib_event,
struct net_device **net_dev)
{
struct cma_req_info req;
@ -1498,10 +1532,6 @@ static struct rdma_id_private *cma_id_from_event(struct ib_cm_id *cm_id,
if (PTR_ERR(*net_dev) == -EAFNOSUPPORT) {
/* Assuming the protocol is AF_IB */
*net_dev = NULL;
} else if (rdma_protocol_roce(req.device, req.port)) {
/* TODO find the net dev matching the request parameters
* through the RoCE GID table */
*net_dev = NULL;
} else {
return ERR_CAST(*net_dev);
}
@ -1629,6 +1659,21 @@ static void cma_release_port(struct rdma_id_private *id_priv)
mutex_unlock(&lock);
}
static void cma_leave_roce_mc_group(struct rdma_id_private *id_priv,
struct cma_multicast *mc)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct net_device *ndev = NULL;
if (dev_addr->bound_dev_if)
ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
if (ndev) {
cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid, false);
dev_put(ndev);
}
kref_put(&mc->mcref, release_mc);
}
static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
{
struct cma_multicast *mc;
@ -1642,22 +1687,7 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else {
if (mc->igmp_joined) {
struct rdma_dev_addr *dev_addr =
&id_priv->id.route.addr.dev_addr;
struct net_device *ndev = NULL;
if (dev_addr->bound_dev_if)
ndev = dev_get_by_index(&init_net,
dev_addr->bound_dev_if);
if (ndev) {
cma_igmp_send(ndev,
&mc->multicast.ib->rec.mgid,
false);
dev_put(ndev);
}
}
kref_put(&mc->mcref, release_mc);
cma_leave_roce_mc_group(id_priv, mc);
}
}
}
@ -1699,6 +1729,10 @@ void rdma_destroy_id(struct rdma_cm_id *id)
cma_deref_id(id_priv->id.context);
kfree(id_priv->id.route.path_rec);
if (id_priv->id.route.addr.dev_addr.sgid_attr)
rdma_put_gid_attr(id_priv->id.route.addr.dev_addr.sgid_attr);
put_net(id_priv->id.route.addr.dev_addr.net);
kfree(id_priv);
}
@ -1730,7 +1764,7 @@ reject:
}
static void cma_set_rep_event_data(struct rdma_cm_event *event,
struct ib_cm_rep_event_param *rep_data,
const struct ib_cm_rep_event_param *rep_data,
void *private_data)
{
event->param.conn.private_data = private_data;
@ -1743,10 +1777,11 @@ static void cma_set_rep_event_data(struct rdma_cm_event *event,
event->param.conn.qp_num = rep_data->remote_qpn;
}
static int cma_ib_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
static int cma_ib_handler(struct ib_cm_id *cm_id,
const struct ib_cm_event *ib_event)
{
struct rdma_id_private *id_priv = cm_id->context;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
int ret = 0;
mutex_lock(&id_priv->handler_mutex);
@ -1756,7 +1791,6 @@ static int cma_ib_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
id_priv->state != RDMA_CM_DISCONNECT))
goto out;
memset(&event, 0, sizeof event);
switch (ib_event->event) {
case IB_CM_REQ_ERROR:
case IB_CM_REP_ERROR:
@ -1825,8 +1859,9 @@ out:
return ret;
}
static struct rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id,
struct ib_cm_event *ib_event,
static struct rdma_id_private *
cma_ib_new_conn_id(const struct rdma_cm_id *listen_id,
const struct ib_cm_event *ib_event,
struct net_device *net_dev)
{
struct rdma_id_private *listen_id_priv;
@ -1888,11 +1923,12 @@ err:
return NULL;
}
static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id,
struct ib_cm_event *ib_event,
static struct rdma_id_private *
cma_ib_new_udp_id(const struct rdma_cm_id *listen_id,
const struct ib_cm_event *ib_event,
struct net_device *net_dev)
{
struct rdma_id_private *listen_id_priv;
const struct rdma_id_private *listen_id_priv;
struct rdma_id_private *id_priv;
struct rdma_cm_id *id;
const sa_family_t ss_family = listen_id->route.addr.src_addr.ss_family;
@ -1932,7 +1968,7 @@ err:
}
static void cma_set_req_event_data(struct rdma_cm_event *event,
struct ib_cm_req_event_param *req_data,
const struct ib_cm_req_event_param *req_data,
void *private_data, int offset)
{
event->param.conn.private_data = private_data + offset;
@ -1946,7 +1982,8 @@ static void cma_set_req_event_data(struct rdma_cm_event *event,
event->param.conn.qp_num = req_data->remote_qpn;
}
static int cma_check_req_qp_type(struct rdma_cm_id *id, struct ib_cm_event *ib_event)
static int cma_ib_check_req_qp_type(const struct rdma_cm_id *id,
const struct ib_cm_event *ib_event)
{
return (((ib_event->event == IB_CM_REQ_RECEIVED) &&
(ib_event->param.req_rcvd.qp_type == id->qp_type)) ||
@ -1955,19 +1992,20 @@ static int cma_check_req_qp_type(struct rdma_cm_id *id, struct ib_cm_event *ib_e
(!id->qp_type));
}
static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
static int cma_ib_req_handler(struct ib_cm_id *cm_id,
const struct ib_cm_event *ib_event)
{
struct rdma_id_private *listen_id, *conn_id = NULL;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
struct net_device *net_dev;
u8 offset;
int ret;
listen_id = cma_id_from_event(cm_id, ib_event, &net_dev);
listen_id = cma_ib_id_from_event(cm_id, ib_event, &net_dev);
if (IS_ERR(listen_id))
return PTR_ERR(listen_id);
if (!cma_check_req_qp_type(&listen_id->id, ib_event)) {
if (!cma_ib_check_req_qp_type(&listen_id->id, ib_event)) {
ret = -EINVAL;
goto net_dev_put;
}
@ -1978,16 +2016,15 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
goto err1;
}
memset(&event, 0, sizeof event);
offset = cma_user_data_offset(listen_id);
event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
if (ib_event->event == IB_CM_SIDR_REQ_RECEIVED) {
conn_id = cma_new_udp_id(&listen_id->id, ib_event, net_dev);
conn_id = cma_ib_new_udp_id(&listen_id->id, ib_event, net_dev);
event.param.ud.private_data = ib_event->private_data + offset;
event.param.ud.private_data_len =
IB_CM_SIDR_REQ_PRIVATE_DATA_SIZE - offset;
} else {
conn_id = cma_new_conn_id(&listen_id->id, ib_event, net_dev);
conn_id = cma_ib_new_conn_id(&listen_id->id, ib_event, net_dev);
cma_set_req_event_data(&event, &ib_event->param.req_rcvd,
ib_event->private_data, offset);
}
@ -2087,7 +2124,7 @@ EXPORT_SYMBOL(rdma_read_gids);
static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
{
struct rdma_id_private *id_priv = iw_id->context;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
int ret = 0;
struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
@ -2096,7 +2133,6 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
if (id_priv->state != RDMA_CM_CONNECT)
goto out;
memset(&event, 0, sizeof event);
switch (iw_event->event) {
case IW_CM_EVENT_CLOSE:
event.event = RDMA_CM_EVENT_DISCONNECTED;
@ -2156,11 +2192,17 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
{
struct rdma_cm_id *new_cm_id;
struct rdma_id_private *listen_id, *conn_id;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
int ret = -ECONNABORTED;
struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
event.param.conn.private_data = iw_event->private_data;
event.param.conn.private_data_len = iw_event->private_data_len;
event.param.conn.initiator_depth = iw_event->ird;
event.param.conn.responder_resources = iw_event->ord;
listen_id = cm_id->context;
mutex_lock(&listen_id->handler_mutex);
@ -2202,13 +2244,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr));
memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr));
memset(&event, 0, sizeof event);
event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
event.param.conn.private_data = iw_event->private_data;
event.param.conn.private_data_len = iw_event->private_data_len;
event.param.conn.initiator_depth = iw_event->ird;
event.param.conn.responder_resources = iw_event->ord;
/*
* Protect against the user destroying conn_id from another thread
* until we're done accessing it.
@ -2241,7 +2276,8 @@ static int cma_ib_listen(struct rdma_id_private *id_priv)
addr = cma_src_addr(id_priv);
svc_id = rdma_get_service_id(&id_priv->id, addr);
id = ib_cm_insert_listen(id_priv->id.device, cma_req_handler, svc_id);
id = ib_cm_insert_listen(id_priv->id.device,
cma_ib_req_handler, svc_id);
if (IS_ERR(id))
return PTR_ERR(id);
id_priv->cm_id.ib = id;
@ -2561,8 +2597,6 @@ cma_iboe_set_path_rec_l2_fields(struct rdma_id_private *id_priv)
route->path_rec->rec_type = sa_conv_gid_to_pathrec_type(gid_type);
route->path_rec->roce.route_resolved = true;
sa_path_set_ndev(route->path_rec, addr->dev_addr.net);
sa_path_set_ifindex(route->path_rec, ndev->ifindex);
sa_path_set_dmac(route->path_rec, addr->dev_addr.dst_dev_addr);
return ndev;
}
@ -2791,7 +2825,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
p = 1;
port_found:
ret = ib_get_cached_gid(cma_dev->device, p, 0, &gid, NULL);
ret = rdma_query_gid(cma_dev->device, p, 0, &gid);
if (ret)
goto out;
@ -2817,9 +2851,8 @@ static void addr_handler(int status, struct sockaddr *src_addr,
struct rdma_dev_addr *dev_addr, void *context)
{
struct rdma_id_private *id_priv = context;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
memset(&event, 0, sizeof event);
mutex_lock(&id_priv->handler_mutex);
if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_QUERY,
RDMA_CM_ADDR_RESOLVED))
@ -2910,7 +2943,7 @@ err:
}
static int cma_bind_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
struct sockaddr *dst_addr)
const struct sockaddr *dst_addr)
{
if (!src_addr || !src_addr->sa_family) {
src_addr = (struct sockaddr *) &id->route.addr.src_addr;
@ -2931,31 +2964,25 @@ static int cma_bind_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
}
int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
struct sockaddr *dst_addr, int timeout_ms)
const struct sockaddr *dst_addr, int timeout_ms)
{
struct rdma_id_private *id_priv;
int ret;
id_priv = container_of(id, struct rdma_id_private, id);
memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
if (id_priv->state == RDMA_CM_IDLE) {
ret = cma_bind_addr(id, src_addr, dst_addr);
if (ret) {
memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
if (ret)
return ret;
}
}
if (cma_family(id_priv) != dst_addr->sa_family) {
memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
if (cma_family(id_priv) != dst_addr->sa_family)
return -EINVAL;
}
if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) {
memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY))
return -EINVAL;
}
memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
atomic_inc(&id_priv->refcount);
if (cma_any_addr(dst_addr)) {
ret = cma_resolve_loopback(id_priv);
@ -3451,18 +3478,18 @@ static int cma_format_hdr(void *hdr, struct rdma_id_private *id_priv)
}
static int cma_sidr_rep_handler(struct ib_cm_id *cm_id,
struct ib_cm_event *ib_event)
const struct ib_cm_event *ib_event)
{
struct rdma_id_private *id_priv = cm_id->context;
struct rdma_cm_event event;
struct ib_cm_sidr_rep_event_param *rep = &ib_event->param.sidr_rep_rcvd;
struct rdma_cm_event event = {};
const struct ib_cm_sidr_rep_event_param *rep =
&ib_event->param.sidr_rep_rcvd;
int ret = 0;
mutex_lock(&id_priv->handler_mutex);
if (id_priv->state != RDMA_CM_CONNECT)
goto out;
memset(&event, 0, sizeof event);
switch (ib_event->event) {
case IB_CM_SIDR_REQ_ERROR:
event.event = RDMA_CM_EVENT_UNREACHABLE;
@ -3488,7 +3515,8 @@ static int cma_sidr_rep_handler(struct ib_cm_id *cm_id,
ib_init_ah_attr_from_path(id_priv->id.device,
id_priv->id.port_num,
id_priv->id.route.path_rec,
&event.param.ud.ah_attr);
&event.param.ud.ah_attr,
rep->sgid_attr);
event.param.ud.qp_num = rep->qpn;
event.param.ud.qkey = rep->qkey;
event.event = RDMA_CM_EVENT_ESTABLISHED;
@ -3501,6 +3529,8 @@ static int cma_sidr_rep_handler(struct ib_cm_id *cm_id,
}
ret = id_priv->id.event_handler(&id_priv->id, &event);
rdma_destroy_ah_attr(&event.param.ud.ah_attr);
if (ret) {
/* Destroy the CM ID by returning a non-zero value. */
id_priv->cm_id.ib = NULL;
@ -3557,6 +3587,7 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
id_priv->cm_id.ib = id;
req.path = id_priv->id.route.path_rec;
req.sgid_attr = id_priv->id.route.addr.dev_addr.sgid_attr;
req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8);
req.max_cm_retries = CMA_MAX_CM_RETRIES;
@ -3618,6 +3649,8 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
if (route->num_paths == 2)
req.alternate_path = &route->path_rec[1];
req.ppath_sgid_attr = id_priv->id.route.addr.dev_addr.sgid_attr;
/* Alternate path SGID attribute currently unsupported */
req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
req.qp_num = id_priv->qp_num;
req.qp_type = id_priv->id.qp_type;
@ -3928,7 +3961,7 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
{
struct rdma_id_private *id_priv;
struct cma_multicast *mc = multicast->context;
struct rdma_cm_event event;
struct rdma_cm_event event = {};
int ret = 0;
id_priv = mc->id_priv;
@ -3952,7 +3985,6 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
}
mutex_unlock(&id_priv->qp_mutex);
memset(&event, 0, sizeof event);
event.status = status;
event.param.ud.private_data = mc->context;
if (!status) {
@ -3981,6 +4013,8 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
event.event = RDMA_CM_EVENT_MULTICAST_ERROR;
ret = id_priv->id.event_handler(&id_priv->id, &event);
rdma_destroy_ah_attr(&event.param.ud.ah_attr);
if (ret) {
cma_exch(id_priv, RDMA_CM_DESTROYING);
mutex_unlock(&id_priv->handler_mutex);
@ -4010,7 +4044,7 @@ static void cma_set_mgid(struct rdma_id_private *id_priv,
memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
} else if (addr->sa_family == AF_IB) {
memcpy(mgid, &((struct sockaddr_ib *) addr)->sib_addr, sizeof *mgid);
} else if ((addr->sa_family == AF_INET6)) {
} else if (addr->sa_family == AF_INET6) {
ipv6_ib_mc_map(&sin6->sin6_addr, dev_addr->broadcast, mc_map);
if (id_priv->id.ps == RDMA_PS_UDP)
mc_map[7] = 0x01; /* Use RDMA CM signature */
@ -4168,8 +4202,6 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
if (!send_only) {
err = cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid,
true);
if (!err)
mc->igmp_joined = true;
}
}
} else {
@ -4221,26 +4253,29 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
memcpy(&mc->addr, addr, rdma_addr_size(addr));
mc->context = context;
mc->id_priv = id_priv;
mc->igmp_joined = false;
mc->join_state = join_state;
spin_lock(&id_priv->lock);
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);
if (rdma_protocol_roce(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
} else if (rdma_cap_ib_mcast(id->device, id->port_num))
if (ret)
goto out_err;
} else if (rdma_cap_ib_mcast(id->device, id->port_num)) {
ret = cma_join_ib_multicast(id_priv, mc);
else
if (ret)
goto out_err;
} else {
ret = -ENOSYS;
if (ret) {
spin_lock_irq(&id_priv->lock);
list_del(&mc->list);
spin_unlock_irq(&id_priv->lock);
kfree(mc);
goto out_err;
}
spin_lock(&id_priv->lock);
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);
return 0;
out_err:
kfree(mc);
return ret;
}
EXPORT_SYMBOL(rdma_join_multicast);
@ -4268,23 +4303,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_protocol_roce(id->device, id->port_num)) {
if (mc->igmp_joined) {
struct rdma_dev_addr *dev_addr =
&id->route.addr.dev_addr;
struct net_device *ndev = NULL;
if (dev_addr->bound_dev_if)
ndev = dev_get_by_index(dev_addr->net,
dev_addr->bound_dev_if);
if (ndev) {
cma_igmp_send(ndev,
&mc->multicast.ib->rec.mgid,
false);
dev_put(ndev);
}
mc->igmp_joined = false;
}
kref_put(&mc->mcref, release_mc);
cma_leave_roce_mc_group(id_priv, mc);
}
return;
}
@ -4410,7 +4429,7 @@ free_cma_dev:
static int cma_remove_id_dev(struct rdma_id_private *id_priv)
{
struct rdma_cm_event event;
struct rdma_cm_event event = {};
enum rdma_cm_state state;
int ret = 0;
@ -4426,7 +4445,6 @@ static int cma_remove_id_dev(struct rdma_id_private *id_priv)
if (!cma_comp(id_priv, RDMA_CM_DEVICE_REMOVAL))
goto out;
memset(&event, 0, sizeof event);
event.event = RDMA_CM_EVENT_DEVICE_REMOVAL;
ret = id_priv->id.event_handler(&id_priv->id, &event);
out:
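
The `rdma_join_multicast()` hunk above appears to move the `list_add()` so that it runs only after the join has succeeded, which leaves the error path as a single `kfree(mc)`. Below is a minimal userspace sketch of that ordering, not the kernel code itself. The names `mc_entry`, `mc_owner`, `do_join`, `join_group` and `count_groups` are hypothetical stand-ins, and the locking around the list is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct cma_multicast and the id's mc_list. */
struct mc_entry {
	struct mc_entry *next;
	int group;
};

struct mc_owner {
	struct mc_entry *head;	/* entries that joined successfully */
};

/* Hypothetical join step: fails for negative group numbers. */
static int do_join(int group)
{
	return group < 0 ? -EINVAL : 0;
}

/*
 * Allocate, join, and only then link the entry into the owner's list.
 * On failure nothing was published, so the error path is a plain free().
 */
static int join_group(struct mc_owner *owner, int group)
{
	struct mc_entry *mc;
	int ret;

	assert(owner != NULL);

	mc = calloc(1, sizeof(*mc));
	if (!mc)
		return -ENOMEM;
	mc->group = group;

	ret = do_join(group);
	if (ret)
		goto out_err;

	/* Publish only after success. */
	mc->next = owner->head;
	owner->head = mc;
	return 0;

out_err:
	free(mc);
	return ret;
}

/* Count the entries currently linked into the owner's list. */
static size_t count_groups(const struct mc_owner *owner)
{
	const struct mc_entry *mc;
	size_t n = 0;

	for (mc = owner->head; mc; mc = mc->next)
		n++;
	return n;
}
```

With this ordering a failed join never has to unlink anything, because other code walking the list cannot have seen the entry.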


@ -91,7 +91,7 @@ void ib_device_unregister_sysfs(struct ib_device *device);
typedef void (*roce_netdev_callback)(struct ib_device *device, u8 port,
struct net_device *idev, void *cookie);
typedef int (*roce_netdev_filter)(struct ib_device *device, u8 port,
typedef bool (*roce_netdev_filter)(struct ib_device *device, u8 port,
struct net_device *idev, void *cookie);
void ib_enum_roce_netdev(struct ib_device *ib_dev,


@ -105,8 +105,6 @@ static int ib_device_check_mandatory(struct ib_device *device)
IB_MANDATORY_FUNC(query_pkey),
IB_MANDATORY_FUNC(alloc_pd),
IB_MANDATORY_FUNC(dealloc_pd),
IB_MANDATORY_FUNC(create_ah),
IB_MANDATORY_FUNC(destroy_ah),
IB_MANDATORY_FUNC(create_qp),
IB_MANDATORY_FUNC(modify_qp),
IB_MANDATORY_FUNC(destroy_qp),
@ -861,25 +859,6 @@ int ib_query_port(struct ib_device *device,
}
EXPORT_SYMBOL(ib_query_port);
/**
* ib_query_gid - Get GID table entry
* @device:Device to query
* @port_num:Port number to query
* @index:GID table index to query
* @gid:Returned GID
* @attr: Returned GID attributes related to this GID index (only in RoCE).
* NULL means ignore.
*
* ib_query_gid() fetches the specified GID table entry from the cache.
*/
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid,
struct ib_gid_attr *attr)
{
return ib_get_cached_gid(device, port_num, index, gid, attr);
}
EXPORT_SYMBOL(ib_query_gid);
/**
* ib_enum_roce_netdev - enumerate all RoCE ports
* @ib_dev : IB device we want to query
@ -1057,7 +1036,7 @@ int ib_find_gid(struct ib_device *device, union ib_gid *gid,
continue;
for (i = 0; i < device->port_immutable[port].gid_tbl_len; ++i) {
ret = ib_query_gid(device, port, i, &tmp_gid, NULL);
ret = rdma_query_gid(device, port, i, &tmp_gid);
if (ret)
return ret;
if (!memcmp(&tmp_gid, gid, sizeof *gid)) {


@ -38,6 +38,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/dma-mapping.h>
#include <linux/idr.h>
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/security.h>
@ -58,8 +59,13 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests
module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
/*
* The mlx4 driver uses the top byte to distinguish which virtual function
* generated the MAD, so we must avoid using it.
*/
#define AGENT_ID_LIMIT (1 << 24)
static DEFINE_IDR(ib_mad_clients);
static struct list_head ib_mad_port_list;
static atomic_t ib_mad_client_id = ATOMIC_INIT(0);
/* Port list lock */
static DEFINE_SPINLOCK(ib_mad_port_list_lock);
@ -190,6 +196,8 @@ EXPORT_SYMBOL(ib_response_mad);
/*
* ib_register_mad_agent - Register to send/receive MADs
*
* Context: Process context.
*/
struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
u8 port_num,
@ -210,7 +218,6 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
struct ib_mad_mgmt_vendor_class *vendor_class;
struct ib_mad_mgmt_method_table *method;
int ret2, qpn;
unsigned long flags;
u8 mgmt_class, vclass;
/* Validate parameters */
@ -376,13 +383,24 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
goto error4;
}
spin_lock_irqsave(&port_priv->reg_lock, flags);
mad_agent_priv->agent.hi_tid = atomic_inc_return(&ib_mad_client_id);
idr_preload(GFP_KERNEL);
idr_lock(&ib_mad_clients);
ret2 = idr_alloc_cyclic(&ib_mad_clients, mad_agent_priv, 0,
AGENT_ID_LIMIT, GFP_ATOMIC);
idr_unlock(&ib_mad_clients);
idr_preload_end();
if (ret2 < 0) {
ret = ERR_PTR(ret2);
goto error5;
}
mad_agent_priv->agent.hi_tid = ret2;
/*
* Make sure MAD registration (if supplied)
* is non overlapping with any existing ones
*/
spin_lock_irq(&port_priv->reg_lock);
if (mad_reg_req) {
mgmt_class = convert_mgmt_class(mad_reg_req->mgmt_class);
if (!is_vendor_class(mgmt_class)) {
@ -393,7 +411,7 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
if (method) {
if (method_in_use(&method,
mad_reg_req))
goto error5;
goto error6;
}
}
ret2 = add_nonoui_reg_req(mad_reg_req, mad_agent_priv,
@ -409,24 +427,25 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
if (is_vendor_method_in_use(
vendor_class,
mad_reg_req))
goto error5;
goto error6;
}
}
ret2 = add_oui_reg_req(mad_reg_req, mad_agent_priv);
}
if (ret2) {
ret = ERR_PTR(ret2);
goto error5;
goto error6;
}
}
/* Add mad agent into port's agent list */
list_add_tail(&mad_agent_priv->agent_list, &port_priv->agent_list);
spin_unlock_irqrestore(&port_priv->reg_lock, flags);
spin_unlock_irq(&port_priv->reg_lock);
return &mad_agent_priv->agent;
error6:
spin_unlock_irq(&port_priv->reg_lock);
idr_lock(&ib_mad_clients);
idr_remove(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
idr_unlock(&ib_mad_clients);
error5:
spin_unlock_irqrestore(&port_priv->reg_lock, flags);
ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
error4:
kfree(reg_req);
@ -575,7 +594,6 @@ static inline void deref_snoop_agent(struct ib_mad_snoop_private *mad_snoop_priv
static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
{
struct ib_mad_port_private *port_priv;
unsigned long flags;
/* Note that we could still be handling received MADs */
@ -587,10 +605,12 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
port_priv = mad_agent_priv->qp_info->port_priv;
cancel_delayed_work(&mad_agent_priv->timed_work);
spin_lock_irqsave(&port_priv->reg_lock, flags);
spin_lock_irq(&port_priv->reg_lock);
remove_mad_reg_req(mad_agent_priv);
list_del(&mad_agent_priv->agent_list);
spin_unlock_irqrestore(&port_priv->reg_lock, flags);
spin_unlock_irq(&port_priv->reg_lock);
idr_lock(&ib_mad_clients);
idr_remove(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
idr_unlock(&ib_mad_clients);
flush_workqueue(port_priv->wq);
ib_cancel_rmpp_recvs(mad_agent_priv);
@ -601,7 +621,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
kfree(mad_agent_priv->reg_req);
kfree(mad_agent_priv);
kfree_rcu(mad_agent_priv, rcu);
}
static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv)
@ -625,6 +645,8 @@ static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv)
/*
* ib_unregister_mad_agent - Unregisters a client from using MAD services
*
* Context: Process context.
*/
void ib_unregister_mad_agent(struct ib_mad_agent *mad_agent)
{
@ -1159,7 +1181,6 @@ int ib_send_mad(struct ib_mad_send_wr_private *mad_send_wr)
{
struct ib_mad_qp_info *qp_info;
struct list_head *list;
struct ib_send_wr *bad_send_wr;
struct ib_mad_agent *mad_agent;
struct ib_sge *sge;
unsigned long flags;
@ -1197,7 +1218,7 @@ int ib_send_mad(struct ib_mad_send_wr_private *mad_send_wr)
spin_lock_irqsave(&qp_info->send_queue.lock, flags);
if (qp_info->send_queue.count < qp_info->send_queue.max_active) {
ret = ib_post_send(mad_agent->qp, &mad_send_wr->send_wr.wr,
&bad_send_wr);
NULL);
list = &qp_info->send_queue.list;
} else {
ret = 0;
@ -1720,22 +1741,19 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
struct ib_mad_agent_private *mad_agent = NULL;
unsigned long flags;
spin_lock_irqsave(&port_priv->reg_lock, flags);
if (ib_response_mad(mad_hdr)) {
u32 hi_tid;
struct ib_mad_agent_private *entry;
/*
* Routing is based on high 32 bits of transaction ID
* of MAD.
*/
hi_tid = be64_to_cpu(mad_hdr->tid) >> 32;
list_for_each_entry(entry, &port_priv->agent_list, agent_list) {
if (entry->agent.hi_tid == hi_tid) {
mad_agent = entry;
break;
}
}
rcu_read_lock();
mad_agent = idr_find(&ib_mad_clients, hi_tid);
if (mad_agent && !atomic_inc_not_zero(&mad_agent->refcount))
mad_agent = NULL;
rcu_read_unlock();
} else {
struct ib_mad_mgmt_class_table *class;
struct ib_mad_mgmt_method_table *method;
@ -1744,6 +1762,7 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
const struct ib_vendor_mad *vendor_mad;
int index;
spin_lock_irqsave(&port_priv->reg_lock, flags);
/*
* Routing is based on version, class, and method
* For "newer" vendor MADs, also based on OUI
@ -1783,20 +1802,19 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
~IB_MGMT_METHOD_RESP];
}
}
if (mad_agent)
atomic_inc(&mad_agent->refcount);
out:
spin_unlock_irqrestore(&port_priv->reg_lock, flags);
}
if (mad_agent) {
if (mad_agent->agent.recv_handler)
atomic_inc(&mad_agent->refcount);
else {
if (mad_agent && !mad_agent->agent.recv_handler) {
dev_notice(&port_priv->device->dev,
"No receive handler for client %p on port %d\n",
&mad_agent->agent, port_priv->port_num);
deref_mad_agent(mad_agent);
mad_agent = NULL;
}
}
out:
spin_unlock_irqrestore(&port_priv->reg_lock, flags);
return mad_agent;
}
@ -1896,8 +1914,8 @@ static inline int rcv_has_same_gid(const struct ib_mad_agent_private *mad_agent_
const struct ib_global_route *grh =
rdma_ah_read_grh(&attr);
if (ib_get_cached_gid(device, port_num,
grh->sgid_index, &sgid, NULL))
if (rdma_query_gid(device, port_num,
grh->sgid_index, &sgid))
return 0;
return !memcmp(sgid.raw, rwc->recv_buf.grh->dgid.raw,
16);
@ -2457,7 +2475,6 @@ static void ib_mad_send_done(struct ib_cq *cq, struct ib_wc *wc)
struct ib_mad_send_wr_private *mad_send_wr, *queued_send_wr;
struct ib_mad_qp_info *qp_info;
struct ib_mad_queue *send_queue;
struct ib_send_wr *bad_send_wr;
struct ib_mad_send_wc mad_send_wc;
unsigned long flags;
int ret;
@ -2507,7 +2524,7 @@ retry:
if (queued_send_wr) {
ret = ib_post_send(qp_info->qp, &queued_send_wr->send_wr.wr,
&bad_send_wr);
NULL);
if (ret) {
dev_err(&port_priv->device->dev,
"ib_post_send failed: %d\n", ret);
@ -2552,11 +2569,9 @@ static bool ib_mad_send_error(struct ib_mad_port_private *port_priv,
if (wc->status == IB_WC_WR_FLUSH_ERR) {
if (mad_send_wr->retry) {
/* Repost send */
struct ib_send_wr *bad_send_wr;
mad_send_wr->retry = 0;
ret = ib_post_send(qp_info->qp, &mad_send_wr->send_wr.wr,
&bad_send_wr);
NULL);
if (!ret)
return false;
}
@ -2872,7 +2887,7 @@ static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info,
int post, ret;
struct ib_mad_private *mad_priv;
struct ib_sge sg_list;
struct ib_recv_wr recv_wr, *bad_recv_wr;
struct ib_recv_wr recv_wr;
struct ib_mad_queue *recv_queue = &qp_info->recv_queue;
/* Initialize common scatter list fields */
@ -2916,7 +2931,7 @@ static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info,
post = (++recv_queue->count < recv_queue->max_active);
list_add_tail(&mad_priv->header.mad_list.list, &recv_queue->list);
spin_unlock_irqrestore(&recv_queue->lock, flags);
ret = ib_post_recv(qp_info->qp, &recv_wr, &bad_recv_wr);
ret = ib_post_recv(qp_info->qp, &recv_wr, NULL);
if (ret) {
spin_lock_irqsave(&recv_queue->lock, flags);
list_del(&mad_priv->header.mad_list.list);
@ -3159,7 +3174,6 @@ static int ib_mad_port_open(struct ib_device *device,
port_priv->device = device;
port_priv->port_num = port_num;
spin_lock_init(&port_priv->reg_lock);
INIT_LIST_HEAD(&port_priv->agent_list);
init_mad_qp(port_priv, &port_priv->qp_info[0]);
init_mad_qp(port_priv, &port_priv->qp_info[1]);
@ -3338,6 +3352,9 @@ int ib_mad_init(void)
INIT_LIST_HEAD(&ib_mad_port_list);
/* Client ID 0 is used for snoop-only clients */
idr_alloc(&ib_mad_clients, NULL, 0, 0, GFP_KERNEL);
if (ib_register_client(&mad_client)) {
pr_err("Couldn't register ib_mad client\n");
return -EINVAL;
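
The `find_mad_agent()` hunk above appears to replace the per-port list walk with an ID lookup followed by `atomic_inc_not_zero()`, so that an agent whose reference count has already reached zero is not handed out. Below is a minimal userspace sketch of that lookup using C11 atomics. It is an illustration under stated assumptions: the fixed `table` array stands in for the IDR, `struct agent`, `ref_inc_not_zero`, `find_agent` and `top_byte_clear` are hypothetical names, and RCU is not modelled (in the kernel, `rcu_read_lock()` together with `kfree_rcu()` is what keeps the memory valid during the lookup).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AGENT_ID_LIMIT (1u << 24)	/* same bound as in the diff */
#define TABLE_SIZE 8			/* hypothetical, stands in for the IDR */

/* Hypothetical stand-in for struct ib_mad_agent_private. */
struct agent {
	atomic_int refcount;
};

static struct agent *table[TABLE_SIZE];

/* Take a reference only if the count has not already dropped to zero. */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Route a response by the high 32 bits of its 64-bit transaction ID. */
static struct agent *find_agent(uint64_t tid)
{
	uint32_t hi_tid = (uint32_t)(tid >> 32);
	struct agent *a;

	if (hi_tid >= TABLE_SIZE)
		return NULL;
	a = table[hi_tid];
	if (a && !ref_inc_not_zero(&a->refcount))
		a = NULL;	/* agent is being torn down */
	return a;
}

/* IDs below AGENT_ID_LIMIT leave the top byte of a 32-bit hi_tid clear. */
static bool top_byte_clear(uint32_t id)
{
	return (id >> 24) == 0;
}
```

The last helper only illustrates the arithmetic behind `AGENT_ID_LIMIT`: any ID below `1 << 24` fits in the low three bytes, which leaves the top byte free for the driver use described in the comment in the diff.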


@ -89,7 +89,6 @@ struct ib_rmpp_segment {
};
struct ib_mad_agent_private {
struct list_head agent_list;
struct ib_mad_agent agent;
struct ib_mad_reg_req *reg_req;
struct ib_mad_qp_info *qp_info;
@ -105,7 +104,10 @@ struct ib_mad_agent_private {
struct list_head rmpp_list;
atomic_t refcount;
union {
struct completion comp;
struct rcu_head rcu;
};
};
struct ib_mad_snoop_private {
@ -203,7 +205,6 @@ struct ib_mad_port_private {
spinlock_t reg_lock;
struct ib_mad_mgmt_version_table version[MAX_MGMT_VERSION];
struct list_head agent_list;
struct workqueue_struct *wq;
struct ib_mad_qp_info qp_info[IB_MAD_QPS_CORE];
};


@ -716,14 +716,28 @@ int ib_sa_get_mcmember_rec(struct ib_device *device, u8 port_num,
}
EXPORT_SYMBOL(ib_sa_get_mcmember_rec);
/**
* ib_init_ah_from_mcmember - Initialize AH attribute from multicast
* member record and gid of the device.
* @device: RDMA device
* @port_num: Port of the rdma device to consider
* @ndev: Optional netdevice, applicable only for RoCE
* @gid_type: GID type to consider
* @ah_attr: AH attribute to fillup on successful completion
*
* ib_init_ah_from_mcmember() initializes AH attribute based on multicast
* member record and other device properties. On success the caller is
* responsible to call rdma_destroy_ah_attr on the ah_attr. Returns 0 on
* success or appropriate error code.
*
*/
int ib_init_ah_from_mcmember(struct ib_device *device, u8 port_num,
struct ib_sa_mcmember_rec *rec,
struct net_device *ndev,
enum ib_gid_type gid_type,
struct rdma_ah_attr *ah_attr)
{
int ret;
u16 gid_index;
const struct ib_gid_attr *sgid_attr;
/* GID table is not based on the netdevice for IB link layer,
* so ignore ndev during search.
@ -733,26 +747,22 @@ int ib_init_ah_from_mcmember(struct ib_device *device, u8 port_num,
else if (!rdma_protocol_roce(device, port_num))
return -EINVAL;
ret = ib_find_cached_gid_by_port(device, &rec->port_gid,
gid_type, port_num,
ndev,
&gid_index);
if (ret)
return ret;
sgid_attr = rdma_find_gid_by_port(device, &rec->port_gid,
gid_type, port_num, ndev);
if (IS_ERR(sgid_attr))
return PTR_ERR(sgid_attr);
memset(ah_attr, 0, sizeof *ah_attr);
memset(ah_attr, 0, sizeof(*ah_attr));
ah_attr->type = rdma_ah_find_type(device, port_num);
rdma_ah_set_dlid(ah_attr, be16_to_cpu(rec->mlid));
rdma_ah_set_sl(ah_attr, rec->sl);
rdma_ah_set_port_num(ah_attr, port_num);
rdma_ah_set_static_rate(ah_attr, rec->rate);
rdma_ah_set_grh(ah_attr, &rec->mgid,
rdma_move_grh_sgid_attr(ah_attr, &rec->mgid,
be32_to_cpu(rec->flow_label),
(u8)gid_index,
rec->hop_limit,
rec->traffic_class);
rec->hop_limit, rec->traffic_class,
sgid_attr);
return 0;
}
EXPORT_SYMBOL(ib_init_ah_from_mcmember);
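
The `ib_init_ah_from_mcmember()` hunk above appears to switch from an `int` return plus a `gid_index` out-parameter to a lookup that returns a pointer, checked with `IS_ERR()` and `PTR_ERR()`. Below is a minimal userspace sketch of that error-pointer convention. The helpers `err_ptr`, `ptr_err` and `is_err` are hypothetical re-creations of the kernel macros, and `gid_attr`, `gid_table`, `find_gid` and `init_from_gid` are invented for illustration. Reference counting of the returned attribute is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr falls in the top MAX_ERRNO addresses, i.e. encodes an error. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, much simplified GID table entry. */
struct gid_attr {
	int index;
	unsigned char gid;
};

static const struct gid_attr gid_table[] = {
	{ 0, 0x10 },
	{ 1, 0x20 },
};

/* Return the matching entry, or an error pointer if there is none. */
static const struct gid_attr *find_gid(unsigned char gid)
{
	size_t i;

	for (i = 0; i < sizeof(gid_table) / sizeof(gid_table[0]); i++)
		if (gid_table[i].gid == gid)
			return &gid_table[i];
	return err_ptr(-ENOENT);
}

/* Caller side: one return value carries either the entry or the error. */
static int init_from_gid(unsigned char gid, int *index_out)
{
	const struct gid_attr *attr = find_gid(gid);

	assert(index_out != NULL);
	if (is_err(attr))
		return (int)ptr_err(attr);
	*index_out = attr->index;
	return 0;
}
```

Returning the entry itself rather than an index lets the caller pass the whole attribute on, which is what the new `sgid_attr` argument in the diff does.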


@ -237,15 +237,15 @@ static int fill_port_info(struct sk_buff *msg,
if (ret)
return ret;
if (rdma_protocol_ib(device, port)) {
BUILD_BUG_ON(sizeof(attr.port_cap_flags) > sizeof(u64));
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CAP_FLAGS,
(u64)attr.port_cap_flags, RDMA_NLDEV_ATTR_PAD))
(u64)attr.port_cap_flags,
RDMA_NLDEV_ATTR_PAD))
return -EMSGSIZE;
if (rdma_protocol_ib(device, port) &&
nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SUBNET_PREFIX,
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SUBNET_PREFIX,
attr.subnet_prefix, RDMA_NLDEV_ATTR_PAD))
return -EMSGSIZE;
if (rdma_protocol_ib(device, port)) {
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_LID, attr.lid))
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_SM_LID, attr.sm_lid))

File diff suppressed because it is too large


@ -43,20 +43,12 @@
#include <rdma/ib_verbs.h>
#include <linux/mutex.h>
int uverbs_ns_idx(u16 *id, unsigned int ns_count);
const struct uverbs_object_spec *uverbs_get_object(const struct ib_device *ibdev,
uint16_t object);
const struct uverbs_method_spec *uverbs_get_method(const struct uverbs_object_spec *object,
uint16_t method);
/*
* These functions initialize the context and cleanups its uobjects.
* The context has a list of objects which is protected by a mutex
* on the context. initialize_ucontext should be called when we create
* a context.
* cleanup_ucontext removes all uobjects from the context and puts them.
*/
void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed);
void uverbs_initialize_ucontext(struct ib_ucontext *ucontext);
struct ib_uverbs_device;
void uverbs_destroy_ufile_hw(struct ib_uverbs_file *ufile,
enum rdma_remove_reason reason);
int uobj_destroy(struct ib_uobject *uobj);
/*
* uverbs_uobject_get is called in order to increase the reference count on
@ -82,7 +74,7 @@ void uverbs_uobject_put(struct ib_uobject *uobject);
void uverbs_close_fd(struct file *f);
/*
* Get an ib_uobject that corresponds to the given id from ucontext, assuming
* Get an ib_uobject that corresponds to the given id from ufile, assuming
* the object is from the given type. Lock it to the required access when
* applicable.
* This function could create (access == NEW), destroy (access == DESTROY)
@ -90,13 +82,11 @@ void uverbs_close_fd(struct file *f);
* The action will be finalized only when uverbs_finalize_object or
* uverbs_finalize_objects are called.
*/
struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
struct ib_ucontext *ucontext,
enum uverbs_obj_access access,
int id);
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit);
struct ib_uobject *
uverbs_get_uobject_from_file(u16 object_id,
struct ib_uverbs_file *ufile,
enum uverbs_obj_access access, s64 id);
/*
* Note that certain finalize stages could return a status:
* (a) alloc_commit could return a failure if the object is committed at the
@ -112,9 +102,63 @@ int uverbs_finalize_object(struct ib_uobject *uobj,
* function. For example, this could happen when we couldn't destroy an
* object.
*/
int uverbs_finalize_objects(struct uverbs_attr_bundle *attrs_bundle,
struct uverbs_attr_spec_hash * const *spec_hash,
size_t num,
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit);
void setup_ufile_idr_uobject(struct ib_uverbs_file *ufile);
void release_ufile_idr_uobject(struct ib_uverbs_file *ufile);
/*
* This is the runtime description of the uverbs API, used by the syscall
* machinery to validate and dispatch calls.
*/
/*
* Depending on ID the slot pointer in the radix tree points at one of these
* structs.
*/
struct uverbs_api_object {
const struct uverbs_obj_type *type_attrs;
const struct uverbs_obj_type_class *type_class;
};
struct uverbs_api_ioctl_method {
int (__rcu *handler)(struct ib_uverbs_file *ufile,
struct uverbs_attr_bundle *ctx);
DECLARE_BITMAP(attr_mandatory, UVERBS_API_ATTR_BKEY_LEN);
u16 bundle_size;
u8 use_stack:1;
u8 driver_method:1;
u8 key_bitmap_len;
u8 destroy_bkey;
};
struct uverbs_api_attr {
struct uverbs_attr_spec spec;
};
struct uverbs_api_object;
struct uverbs_api {
/* radix tree contains struct uverbs_api_* pointers */
struct radix_tree_root radix;
enum rdma_driver_id driver_id;
};
static inline const struct uverbs_api_object *
uapi_get_object(struct uverbs_api *uapi, u16 object_id)
{
return radix_tree_lookup(&uapi->radix, uapi_key_obj(object_id));
}
char *uapi_key_format(char *S, unsigned int key);
struct uverbs_api *uverbs_alloc_api(
const struct uverbs_object_tree_def *const *driver_specs,
enum rdma_driver_id driver_id);
void uverbs_disassociate_api_pre(struct ib_uverbs_device *uverbs_dev);
void uverbs_disassociate_api(struct uverbs_api *uapi);
void uverbs_destroy_api(struct uverbs_api *uapi);
void uapi_compute_bundle_size(struct uverbs_api_ioctl_method *method_elm,
unsigned int num_attrs);
#endif /* RDMA_CORE_H */


@ -143,14 +143,15 @@ static enum bonding_slave_state is_eth_active_slave_of_bonding_rcu(struct net_de
#define REQUIRED_BOND_STATES (BONDING_SLAVE_STATE_ACTIVE | \
BONDING_SLAVE_STATE_NA)
static int is_eth_port_of_netdev(struct ib_device *ib_dev, u8 port,
static bool
is_eth_port_of_netdev_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
struct net_device *real_dev;
int res;
bool res;
if (!rdma_ndev)
return 0;
return false;
rcu_read_lock();
real_dev = rdma_vlan_dev_real_dev(cookie);
@ -166,14 +167,15 @@ static int is_eth_port_of_netdev(struct ib_device *ib_dev, u8 port,
return res;
}
static int is_eth_port_inactive_slave(struct ib_device *ib_dev, u8 port,
static bool
is_eth_port_inactive_slave_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
struct net_device *master_dev;
int res;
bool res;
if (!rdma_ndev)
return 0;
return false;
rcu_read_lock();
master_dev = netdev_master_upper_dev_get_rcu(rdma_ndev);
@ -184,22 +186,59 @@ static int is_eth_port_inactive_slave(struct ib_device *ib_dev, u8 port,
return res;
}
static int pass_all_filter(struct ib_device *ib_dev, u8 port,
/** is_ndev_for_default_gid_filter - Check if a given netdevice
* can be considered for default GIDs or not.
* @ib_dev: IB device to check
* @port: Port to consider for adding default GID
* @rdma_ndev: rdma netdevice pointer
* @cookie_ndev: Netdevice to consider to form a default GID
*
* is_ndev_for_default_gid_filter() returns true if a given netdevice can be
* considered for deriving default RoCE GID, returns false otherwise.
*/
static bool
is_ndev_for_default_gid_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
return 1;
}
static int upper_device_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
int res;
struct net_device *cookie_ndev = cookie;
bool res;
if (!rdma_ndev)
return 0;
return false;
rcu_read_lock();
/*
* When rdma netdevice is used in bonding, bonding master netdevice
* should be considered for default GIDs. Therefore, ignore slave rdma
* netdevices when bonding is considered.
* Additionally when event(cookie) netdevice is bond master device,
* make sure that it is the upper netdevice of rdma netdevice.
*/
res = ((cookie_ndev == rdma_ndev && !netif_is_bond_slave(rdma_ndev)) ||
(netif_is_bond_master(cookie_ndev) &&
rdma_is_upper_dev_rcu(rdma_ndev, cookie_ndev)));
rcu_read_unlock();
return res;
}
static bool pass_all_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
return true;
}
static bool upper_device_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
bool res;
if (!rdma_ndev)
return false;
if (rdma_ndev == cookie)
return 1;
return true;
rcu_read_lock();
res = rdma_is_upper_dev_rcu(rdma_ndev, cookie);
@ -208,6 +247,34 @@ static int upper_device_filter(struct ib_device *ib_dev, u8 port,
return res;
}
/**
* is_upper_ndev_bond_master_filter - Check if a given netdevice
* is bond master device of netdevice of the RDMA device of port.
* @ib_dev: IB device to check
* @port: Port to consider for adding default GID
* @rdma_ndev: Pointer to rdma netdevice
* @cookie: Netdevice to consider to form a default GID
*
* is_upper_ndev_bond_master_filter() returns true if a cookie_netdev
* is bond master device and rdma_ndev is its lower netdevice. It might
* not have been established as slave device yet.
*/
static bool
is_upper_ndev_bond_master_filter(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev,
void *cookie)
{
struct net_device *cookie_ndev = cookie;
bool match = false;
rcu_read_lock();
if (netif_is_bond_master(cookie_ndev) &&
rdma_is_upper_dev_rcu(rdma_ndev, cookie_ndev))
match = true;
rcu_read_unlock();
return match;
}
static void update_gid_ip(enum gid_op_type gid_op,
struct ib_device *ib_dev,
u8 port, struct net_device *ndev,
@ -223,34 +290,10 @@ static void update_gid_ip(enum gid_op_type gid_op,
update_gid(gid_op, ib_dev, port, &gid, &gid_attr);
}
static void enum_netdev_default_gids(struct ib_device *ib_dev,
u8 port, struct net_device *event_ndev,
struct net_device *rdma_ndev)
{
unsigned long gid_type_mask;
rcu_read_lock();
if (!rdma_ndev ||
((rdma_ndev != event_ndev &&
!rdma_is_upper_dev_rcu(rdma_ndev, event_ndev)) ||
is_eth_active_slave_of_bonding_rcu(rdma_ndev,
netdev_master_upper_dev_get_rcu(rdma_ndev)) ==
BONDING_SLAVE_STATE_INACTIVE)) {
rcu_read_unlock();
return;
}
rcu_read_unlock();
gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
ib_cache_gid_set_default_gid(ib_dev, port, rdma_ndev, gid_type_mask,
IB_CACHE_GID_DEFAULT_MODE_SET);
}
static void bond_delete_netdev_default_gids(struct ib_device *ib_dev,
u8 port,
struct net_device *event_ndev,
struct net_device *rdma_ndev)
struct net_device *rdma_ndev,
struct net_device *event_ndev)
{
struct net_device *real_dev = rdma_vlan_dev_real_dev(event_ndev);
unsigned long gid_type_mask;
@ -381,7 +424,6 @@ static void _add_netdev_ips(struct ib_device *ib_dev, u8 port,
static void add_netdev_ips(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
enum_netdev_default_gids(ib_dev, port, cookie, rdma_ndev);
_add_netdev_ips(ib_dev, port, cookie);
}
@ -391,6 +433,38 @@ static void del_netdev_ips(struct ib_device *ib_dev, u8 port,
ib_cache_gid_del_all_netdev_gids(ib_dev, port, cookie);
}
/**
* del_default_gids - Delete default GIDs of the event/cookie netdevice
* @ib_dev: RDMA device pointer
* @port: Port of the RDMA device whose GID table to consider
* @rdma_ndev: Unused rdma netdevice
* @cookie: Pointer to event netdevice
*
* del_default_gids() deletes the default GIDs of the event/cookie netdevice.
*/
static void del_default_gids(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
struct net_device *cookie_ndev = cookie;
unsigned long gid_type_mask;
gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
ib_cache_gid_set_default_gid(ib_dev, port, cookie_ndev, gid_type_mask,
IB_CACHE_GID_DEFAULT_MODE_DELETE);
}
static void add_default_gids(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
struct net_device *event_ndev = cookie;
unsigned long gid_type_mask;
gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
ib_cache_gid_set_default_gid(ib_dev, port, event_ndev, gid_type_mask,
IB_CACHE_GID_DEFAULT_MODE_SET);
}
static void enum_all_gids_of_dev_cb(struct ib_device *ib_dev,
u8 port,
struct net_device *rdma_ndev,
@ -405,9 +479,20 @@ static void enum_all_gids_of_dev_cb(struct ib_device *ib_dev,
rtnl_lock();
down_read(&net_rwsem);
for_each_net(net)
for_each_netdev(net, ndev)
if (is_eth_port_of_netdev(ib_dev, port, rdma_ndev, ndev))
add_netdev_ips(ib_dev, port, rdma_ndev, ndev);
for_each_netdev(net, ndev) {
/*
* Filter and add default GIDs of the primary netdevice
* when not in bonding mode, or add default GIDs
* of bond master device, when in bonding mode.
*/
if (is_ndev_for_default_gid_filter(ib_dev, port,
rdma_ndev, ndev))
add_default_gids(ib_dev, port, rdma_ndev, ndev);
if (is_eth_port_of_netdev_filter(ib_dev, port,
rdma_ndev, ndev))
_add_netdev_ips(ib_dev, port, ndev);
}
up_read(&net_rwsem);
rtnl_unlock();
}
@ -513,18 +598,12 @@ static void del_netdev_default_ips_join(struct ib_device *ib_dev, u8 port,
rcu_read_unlock();
if (master_ndev) {
bond_delete_netdev_default_gids(ib_dev, port, master_ndev,
rdma_ndev);
bond_delete_netdev_default_gids(ib_dev, port, rdma_ndev,
master_ndev);
dev_put(master_ndev);
}
}
static void del_netdev_default_ips(struct ib_device *ib_dev, u8 port,
struct net_device *rdma_ndev, void *cookie)
{
bond_delete_netdev_default_gids(ib_dev, port, cookie, rdma_ndev);
}
/* The following functions operate on all IB devices. netdevice_event and
* addr_event execute ib_enum_all_roce_netdevs through a work.
* ib_enum_all_roce_netdevs iterates through all IB devices.
@ -575,40 +654,94 @@ static int netdevice_queue_work(struct netdev_event_work_cmd *cmds,
}
static const struct netdev_event_work_cmd add_cmd = {
.cb = add_netdev_ips, .filter = is_eth_port_of_netdev};
static const struct netdev_event_work_cmd add_cmd_upper_ips = {
.cb = add_netdev_upper_ips, .filter = is_eth_port_of_netdev};
.cb = add_netdev_ips,
.filter = is_eth_port_of_netdev_filter
};
static void netdevice_event_changeupper(struct netdev_notifier_changeupper_info *changeupper_info,
static const struct netdev_event_work_cmd add_cmd_upper_ips = {
.cb = add_netdev_upper_ips,
.filter = is_eth_port_of_netdev_filter
};
static void
ndev_event_unlink(struct netdev_notifier_changeupper_info *changeupper_info,
struct netdev_event_work_cmd *cmds)
{
static const struct netdev_event_work_cmd upper_ips_del_cmd = {
.cb = del_netdev_upper_ips, .filter = upper_device_filter};
static const struct netdev_event_work_cmd bonding_default_del_cmd = {
.cb = del_netdev_default_ips, .filter = is_eth_port_inactive_slave};
static const struct netdev_event_work_cmd
upper_ips_del_cmd = {
.cb = del_netdev_upper_ips,
.filter = upper_device_filter
};
if (changeupper_info->linking == false) {
cmds[0] = upper_ips_del_cmd;
cmds[0].ndev = changeupper_info->upper_dev;
cmds[1] = add_cmd;
} else {
}
static const struct netdev_event_work_cmd bonding_default_add_cmd = {
.cb = add_default_gids,
.filter = is_upper_ndev_bond_master_filter
};
static void
ndev_event_link(struct net_device *event_ndev,
struct netdev_notifier_changeupper_info *changeupper_info,
struct netdev_event_work_cmd *cmds)
{
static const struct netdev_event_work_cmd
bonding_default_del_cmd = {
.cb = del_default_gids,
.filter = is_upper_ndev_bond_master_filter
};
/*
* When a lower netdev is linked to its upper bonding
* netdev, delete lower slave netdev's default GIDs.
*/
cmds[0] = bonding_default_del_cmd;
cmds[0].ndev = changeupper_info->upper_dev;
cmds[1] = add_cmd_upper_ips;
cmds[0].ndev = event_ndev;
cmds[0].filter_ndev = changeupper_info->upper_dev;
/* Now add bonding upper device default GIDs */
cmds[1] = bonding_default_add_cmd;
cmds[1].ndev = changeupper_info->upper_dev;
cmds[1].filter_ndev = changeupper_info->upper_dev;
}
/* Now add bonding upper device IP based GIDs */
cmds[2] = add_cmd_upper_ips;
cmds[2].ndev = changeupper_info->upper_dev;
cmds[2].filter_ndev = changeupper_info->upper_dev;
}
static void netdevice_event_changeupper(struct net_device *event_ndev,
struct netdev_notifier_changeupper_info *changeupper_info,
struct netdev_event_work_cmd *cmds)
{
if (changeupper_info->linking)
ndev_event_link(event_ndev, changeupper_info, cmds);
else
ndev_event_unlink(changeupper_info, cmds);
}
static const struct netdev_event_work_cmd add_default_gid_cmd = {
.cb = add_default_gids,
.filter = is_ndev_for_default_gid_filter,
};
static int netdevice_event(struct notifier_block *this, unsigned long event,
void *ptr)
{
static const struct netdev_event_work_cmd del_cmd = {
.cb = del_netdev_ips, .filter = pass_all_filter};
static const struct netdev_event_work_cmd bonding_default_del_cmd_join = {
.cb = del_netdev_default_ips_join, .filter = is_eth_port_inactive_slave};
static const struct netdev_event_work_cmd default_del_cmd = {
.cb = del_netdev_default_ips, .filter = pass_all_filter};
static const struct netdev_event_work_cmd
bonding_default_del_cmd_join = {
.cb = del_netdev_default_ips_join,
.filter = is_eth_port_inactive_slave_filter
};
static const struct netdev_event_work_cmd
netdev_del_cmd = {
.cb = del_netdev_ips,
.filter = is_eth_port_of_netdev_filter
};
static const struct netdev_event_work_cmd bonding_event_ips_del_cmd = {
.cb = del_netdev_upper_ips, .filter = upper_device_filter};
struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
@ -621,7 +754,8 @@ static int netdevice_event(struct notifier_block *this, unsigned long event,
case NETDEV_REGISTER:
case NETDEV_UP:
cmds[0] = bonding_default_del_cmd_join;
cmds[1] = add_cmd;
cmds[1] = add_default_gid_cmd;
cmds[2] = add_cmd;
break;
case NETDEV_UNREGISTER:
@ -632,19 +766,22 @@ static int netdevice_event(struct notifier_block *this, unsigned long event,
break;
case NETDEV_CHANGEADDR:
cmds[0] = default_del_cmd;
cmds[1] = add_cmd;
cmds[0] = netdev_del_cmd;
cmds[1] = add_default_gid_cmd;
cmds[2] = add_cmd;
break;
case NETDEV_CHANGEUPPER:
netdevice_event_changeupper(
netdevice_event_changeupper(ndev,
container_of(ptr, struct netdev_notifier_changeupper_info, info),
cmds);
break;
case NETDEV_BONDING_FAILOVER:
cmds[0] = bonding_event_ips_del_cmd;
cmds[1] = bonding_default_del_cmd_join;
/* Add default GIDs of the bond device */
cmds[1] = bonding_default_add_cmd;
/* Add IP based GIDs of the bond device */
cmds[2] = add_cmd_upper_ips;
break;
@ -660,7 +797,8 @@ static void update_gid_event_work_handler(struct work_struct *_work)
struct update_gid_event_work *work =
container_of(_work, struct update_gid_event_work, work);
ib_enum_all_roce_netdevs(is_eth_port_of_netdev, work->gid_attr.ndev,
ib_enum_all_roce_netdevs(is_eth_port_of_netdev_filter,
work->gid_attr.ndev,
callback_for_addr_gid_device_scan, work);
dev_put(work->gid_attr.ndev);
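
The command tables in this file pair a filter with a callback: for every port, the filter decides whether the callback should run for the event (cookie) netdevice. The following user-space sketch illustrates that dispatch pattern only. `struct port`, `struct event_cmd`, `run_cmds` and the integer netdev ids are hypothetical stand-ins, not the kernel's types, and the bonding rule is a simplified approximation of `is_ndev_for_default_gid_filter()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RDMA port and the netdev bound to it. */
struct port {
	int ndev_id;      /* id of the associated netdev, 0 if none */
	int upper_id;     /* id of its bond master, 0 if not enslaved */
	int default_gids; /* number of default GIDs currently installed */
};

/* One step of event handling: run cb only where filter returns true. */
struct event_cmd {
	bool (*filter)(const struct port *p, int cookie_ndev);
	void (*cb)(struct port *p, int cookie_ndev);
};

/* Accept the port's own netdev unless it is enslaved, or its bond master. */
static bool default_gid_filter(const struct port *p, int cookie_ndev)
{
	if (!p->ndev_id)
		return false;
	return (cookie_ndev == p->ndev_id && !p->upper_id) ||
	       (p->upper_id && cookie_ndev == p->upper_id);
}

static bool pass_all(const struct port *p, int cookie_ndev)
{
	(void)p;
	(void)cookie_ndev;
	return true;
}

static void add_default(struct port *p, int cookie_ndev)
{
	(void)cookie_ndev;
	p->default_gids++;
}

static void del_default(struct port *p, int cookie_ndev)
{
	(void)cookie_ndev;
	p->default_gids = 0;
}

/* Apply each command, in order, to every port that passes its filter. */
static void run_cmds(const struct event_cmd *cmds, size_t ncmds,
		     struct port *ports, size_t nports, int cookie_ndev)
{
	for (size_t c = 0; c < ncmds; c++) {
		assert(cmds[c].filter && cmds[c].cb);
		for (size_t i = 0; i < nports; i++)
			if (cmds[c].filter(&ports[i], cookie_ndev))
				cmds[c].cb(&ports[i], cookie_ndev);
	}
}
```

Ordering the commands as "delete, then add" mirrors how the event handler fills `cmds[0]`, `cmds[1]`, `cmds[2]`: each entry is an independent filter/callback pair, so a new step can be inserted without touching the others.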


@ -87,7 +87,7 @@ static int rdma_rw_init_one_mr(struct ib_qp *qp, u8 port_num,
}
ret = ib_map_mr_sg(reg->mr, sg, nents, &offset, PAGE_SIZE);
if (ret < nents) {
if (ret < 0 || ret < nents) {
ib_mr_pool_put(qp, &qp->rdma_mrs, reg->mr);
return -EINVAL;
}
@ -325,7 +325,7 @@ out_unmap_sg:
EXPORT_SYMBOL(rdma_rw_ctx_init);
/**
* rdma_rw_ctx_signature init - initialize a RW context with signature offload
* rdma_rw_ctx_signature_init - initialize a RW context with signature offload
* @ctx: context to initialize
* @qp: queue pair to operate on
* @port_num: port num to which the connection is bound
@ -564,10 +564,10 @@ EXPORT_SYMBOL(rdma_rw_ctx_wrs);
int rdma_rw_ctx_post(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u8 port_num,
struct ib_cqe *cqe, struct ib_send_wr *chain_wr)
{
struct ib_send_wr *first_wr, *bad_wr;
struct ib_send_wr *first_wr;
first_wr = rdma_rw_ctx_wrs(ctx, qp, port_num, cqe, chain_wr);
return ib_post_send(qp, first_wr, &bad_wr);
return ib_post_send(qp, first_wr, NULL);
}
EXPORT_SYMBOL(rdma_rw_ctx_post);
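
The first hunk above changes `ret < nents` to `ret < 0 || ret < nents`. A plausible reason, assuming `nents` is an unsigned count in that function (its declaration is not visible in this hunk), is that comparing a negative `int` with an unsigned value converts the negative number to a very large unsigned one, so the error would slip past the check. The sketch below uses hypothetical helper names to show the difference in plain C.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins: ret models a mapping call that returns either a
 * negative errno or the number of entries mapped; nents is an unsigned count.
 */

/*
 * Old check. The cast spells out the conversion C applies implicitly when an
 * int is compared with an unsigned int: a negative ret becomes a huge value.
 */
static bool short_map_old(int ret, unsigned int nents)
{
	return (unsigned int)ret < nents;
}

/* New check: test for an error code first, then for a short mapping. */
static bool short_map_new(int ret, unsigned int nents)
{
	return ret < 0 || (unsigned int)ret < nents;
}

/* Small self-check showing where the two versions disagree. */
static void short_map_demo(void)
{
	assert(!short_map_old(-22, 4)); /* error is missed */
	assert(short_map_new(-22, 4));  /* error is caught */
	assert(short_map_new(2, 4));    /* short mapping is caught */
	assert(!short_map_new(4, 4));   /* full mapping passes */
}
```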


@ -1227,20 +1227,10 @@ static u8 get_src_path_mask(struct ib_device *device, u8 port_num)
return src_path_mask;
}
static int
roce_resolve_route_from_path(struct ib_device *device, u8 port_num,
struct sa_path_rec *rec)
static int roce_resolve_route_from_path(struct sa_path_rec *rec,
const struct ib_gid_attr *attr)
{
struct net_device *resolved_dev;
struct net_device *ndev;
struct net_device *idev;
struct rdma_dev_addr dev_addr = {
.bound_dev_if = ((sa_path_get_ifindex(rec) >= 0) ?
sa_path_get_ifindex(rec) : 0),
.net = sa_path_get_ndev(rec) ?
sa_path_get_ndev(rec) :
&init_net
};
struct rdma_dev_addr dev_addr = {};
union {
struct sockaddr _sockaddr;
struct sockaddr_in _sockaddr_in;
@ -1250,9 +1240,14 @@ roce_resolve_route_from_path(struct ib_device *device, u8 port_num,
if (rec->roce.route_resolved)
return 0;
if (!attr || !attr->ndev)
return -EINVAL;
if (!device->get_netdev)
return -EOPNOTSUPP;
dev_addr.bound_dev_if = attr->ndev->ifindex;
/* TODO: Use net from the ib_gid_attr once it is added to it,
* until then, limit itself to init_net.
*/
dev_addr.net = &init_net;
rdma_gid2ip(&sgid_addr._sockaddr, &rec->sgid);
rdma_gid2ip(&dgid_addr._sockaddr, &rec->dgid);
@ -1268,60 +1263,52 @@ roce_resolve_route_from_path(struct ib_device *device, u8 port_num,
rec->rec_type != SA_PATH_REC_TYPE_ROCE_V2)
return -EINVAL;
idev = device->get_netdev(device, port_num);
if (!idev)
return -ENODEV;
resolved_dev = dev_get_by_index(dev_addr.net,
dev_addr.bound_dev_if);
if (!resolved_dev) {
ret = -ENODEV;
goto done;
}
ndev = ib_get_ndev_from_path(rec);
rcu_read_lock();
if ((ndev && ndev != resolved_dev) ||
(resolved_dev != idev &&
!rdma_is_upper_dev_rcu(idev, resolved_dev)))
ret = -EHOSTUNREACH;
rcu_read_unlock();
dev_put(resolved_dev);
if (ndev)
dev_put(ndev);
done:
dev_put(idev);
if (!ret)
rec->roce.route_resolved = true;
return ret;
return 0;
}
static int init_ah_attr_grh_fields(struct ib_device *device, u8 port_num,
struct sa_path_rec *rec,
struct rdma_ah_attr *ah_attr)
struct rdma_ah_attr *ah_attr,
const struct ib_gid_attr *gid_attr)
{
enum ib_gid_type type = sa_conv_pathrec_to_gid_type(rec);
struct net_device *ndev;
u16 gid_index;
int ret;
ndev = ib_get_ndev_from_path(rec);
ret = ib_find_cached_gid_by_port(device, &rec->sgid, type,
port_num, ndev, &gid_index);
if (ndev)
dev_put(ndev);
if (ret)
return ret;
if (!gid_attr) {
gid_attr = rdma_find_gid_by_port(device, &rec->sgid, type,
port_num, NULL);
if (IS_ERR(gid_attr))
return PTR_ERR(gid_attr);
} else
rdma_hold_gid_attr(gid_attr);
rdma_ah_set_grh(ah_attr, &rec->dgid,
rdma_move_grh_sgid_attr(ah_attr, &rec->dgid,
be32_to_cpu(rec->flow_label),
gid_index, rec->hop_limit,
rec->traffic_class);
rec->hop_limit, rec->traffic_class,
gid_attr);
return 0;
}
/**
* ib_init_ah_attr_from_path - Initialize address handle attributes based on
* an SA path record.
* @device: Device associated ah attributes initialization.
* @port_num: Port on the specified device.
* @rec: path record entry to use for ah attributes initialization.
* @ah_attr: address handle attributes to initialize from path record.
* @sgid_attr: SGID attribute to consider during initialization.
*
* When ib_init_ah_attr_from_path() returns success,
* (a) for IB link layer it optionally contains a reference to SGID attribute
* when GRH is present.
* (b) for RoCE link layer it contains a reference to SGID attribute.
* User must invoke rdma_destroy_ah_attr() to release reference to SGID
* attributes which are initialized using ib_init_ah_attr_from_path().
*/
int ib_init_ah_attr_from_path(struct ib_device *device, u8 port_num,
struct sa_path_rec *rec,
struct rdma_ah_attr *ah_attr)
struct rdma_ah_attr *ah_attr,
const struct ib_gid_attr *gid_attr)
{
int ret = 0;
@ -1332,7 +1319,7 @@ int ib_init_ah_attr_from_path(struct ib_device *device, u8 port_num,
rdma_ah_set_static_rate(ah_attr, rec->rate);
if (sa_path_is_roce(rec)) {
ret = roce_resolve_route_from_path(device, port_num, rec);
ret = roce_resolve_route_from_path(rec, gid_attr);
if (ret)
return ret;
@ -1349,7 +1336,8 @@ int ib_init_ah_attr_from_path(struct ib_device *device, u8 port_num,
}
if (rec->hop_limit > 0 || sa_path_is_roce(rec))
ret = init_ah_attr_grh_fields(device, port_num, rec, ah_attr);
ret = init_ah_attr_grh_fields(device, port_num,
rec, ah_attr, gid_attr);
return ret;
}
EXPORT_SYMBOL(ib_init_ah_attr_from_path);
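
The kernel-doc above says the initialized `ah_attr` may carry a reference to the SGID attribute that the caller later drops with `rdma_destroy_ah_attr()`. The sketch below models only that ownership rule in user-space C: if the caller passes an attribute, an extra reference is taken; otherwise one is looked up; in both cases the reference is moved into the address-handle structure. All names (`struct attr`, `struct ah`, `ah_init`, `ah_destroy`, `lookup`) are hypothetical and the plain `int` counter stands in for a kref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted attribute. */
struct attr {
	int refs;
};

/* Hypothetical address-handle attributes owning one attr reference. */
struct ah {
	struct attr *sgid_attr;
};

/* Look up an attribute; on success the caller owns one new reference. */
static struct attr *lookup(struct attr *table_entry)
{
	if (!table_entry)
		return NULL;
	table_entry->refs++;
	return table_entry;
}

/* Initialize ah; on success ah holds exactly one reference of its own. */
static int ah_init(struct ah *ah, struct attr *caller_attr,
		   struct attr *table_entry)
{
	struct attr *a = caller_attr;

	if (!a) {
		a = lookup(table_entry);
		if (!a)
			return -1;
	} else {
		a->refs++; /* the caller keeps its own reference */
	}
	ah->sgid_attr = a; /* the reference moves into ah */
	return 0;
}

/* Drop the reference held by ah; safe to call on an empty ah. */
static void ah_destroy(struct ah *ah)
{
	if (!ah->sgid_attr)
		return;
	assert(ah->sgid_attr->refs > 0);
	ah->sgid_attr->refs--;
	ah->sgid_attr = NULL;
}
```

Because the structure always owns its own reference after a successful init, the teardown path does not need to know which of the two branches was taken.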
@ -1557,8 +1545,6 @@ static void ib_sa_path_rec_callback(struct ib_sa_query *sa_query,
ARRAY_SIZE(path_rec_table),
mad->data, &rec);
rec.rec_type = SA_PATH_REC_TYPE_IB;
sa_path_set_ndev(&rec, NULL);
sa_path_set_ifindex(&rec, 0);
sa_path_set_dmac_zero(&rec);
if (query->conv_pr) {
@ -2290,6 +2276,7 @@ static void update_sm_ah(struct work_struct *work)
struct ib_sa_sm_ah *new_ah;
struct ib_port_attr port_attr;
struct rdma_ah_attr ah_attr;
bool grh_required;
if (ib_query_port(port->agent->device, port->port_num, &port_attr)) {
pr_warn("Couldn't query port\n");
@ -2314,17 +2301,28 @@ static void update_sm_ah(struct work_struct *work)
rdma_ah_set_dlid(&ah_attr, port_attr.sm_lid);
rdma_ah_set_sl(&ah_attr, port_attr.sm_sl);
rdma_ah_set_port_num(&ah_attr, port->port_num);
if (port_attr.grh_required) {
if (ah_attr.type == RDMA_AH_ATTR_TYPE_OPA) {
grh_required = rdma_is_grh_required(port->agent->device,
port->port_num);
/*
* The OPA sm_lid of 0xFFFF needs special handling so that it can be
* differentiated from a permissive LID of 0xFFFF. We set the
* grh_required flag here so the SA can program the DGID in the
* address handle appropriately
*/
if (ah_attr.type == RDMA_AH_ATTR_TYPE_OPA &&
(grh_required ||
port_attr.sm_lid == be16_to_cpu(IB_LID_PERMISSIVE)))
rdma_ah_set_make_grd(&ah_attr, true);
} else {
if (ah_attr.type == RDMA_AH_ATTR_TYPE_IB && grh_required) {
rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
rdma_ah_set_subnet_prefix(&ah_attr,
cpu_to_be64(port_attr.subnet_prefix));
rdma_ah_set_interface_id(&ah_attr,
cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
}
}
new_ah->ah = rdma_create_ah(port->agent->qp->pd, &ah_attr);
if (IS_ERR(new_ah->ah)) {


@ -42,6 +42,7 @@
#include <rdma/ib_mad.h>
#include <rdma/ib_pma.h>
#include <rdma/ib_cache.h>
struct ib_port;
@ -346,7 +347,7 @@ static struct attribute *port_default_attrs[] = {
NULL
};
static size_t print_ndev(struct ib_gid_attr *gid_attr, char *buf)
static size_t print_ndev(const struct ib_gid_attr *gid_attr, char *buf)
{
if (!gid_attr->ndev)
return -EINVAL;
@ -354,33 +355,26 @@ static size_t print_ndev(struct ib_gid_attr *gid_attr, char *buf)
return sprintf(buf, "%s\n", gid_attr->ndev->name);
}
static size_t print_gid_type(struct ib_gid_attr *gid_attr, char *buf)
static size_t print_gid_type(const struct ib_gid_attr *gid_attr, char *buf)
{
return sprintf(buf, "%s\n", ib_cache_gid_type_str(gid_attr->gid_type));
}
static ssize_t _show_port_gid_attr(struct ib_port *p,
struct port_attribute *attr,
char *buf,
size_t (*print)(struct ib_gid_attr *gid_attr,
char *buf))
static ssize_t _show_port_gid_attr(
struct ib_port *p, struct port_attribute *attr, char *buf,
size_t (*print)(const struct ib_gid_attr *gid_attr, char *buf))
{
struct port_table_attribute *tab_attr =
container_of(attr, struct port_table_attribute, attr);
union ib_gid gid;
struct ib_gid_attr gid_attr = {};
const struct ib_gid_attr *gid_attr;
ssize_t ret;
ret = ib_query_gid(p->ibdev, p->port_num, tab_attr->index, &gid,
&gid_attr);
if (ret)
goto err;
gid_attr = rdma_get_gid_attr(p->ibdev, p->port_num, tab_attr->index);
if (IS_ERR(gid_attr))
return PTR_ERR(gid_attr);
ret = print(&gid_attr, buf);
err:
if (gid_attr.ndev)
dev_put(gid_attr.ndev);
ret = print(gid_attr, buf);
rdma_put_gid_attr(gid_attr);
return ret;
}
@ -389,26 +383,28 @@ static ssize_t show_port_gid(struct ib_port *p, struct port_attribute *attr,
{
struct port_table_attribute *tab_attr =
container_of(attr, struct port_table_attribute, attr);
union ib_gid *pgid;
union ib_gid gid;
const struct ib_gid_attr *gid_attr;
ssize_t ret;
ret = ib_query_gid(p->ibdev, p->port_num, tab_attr->index, &gid, NULL);
gid_attr = rdma_get_gid_attr(p->ibdev, p->port_num, tab_attr->index);
if (IS_ERR(gid_attr)) {
const union ib_gid zgid = {};
/* If reading GID fails, it is likely due to GID entry being empty
* (invalid) or reserved GID in the table.
* User space expects to read GID table entries as long as the given
* index is within GID table size.
* Administrative/debugging tool fails to query rest of the GID entries
* if it hits error while querying a GID of the given index.
* To avoid user space throwing such error on fail to read gid, return
* zero GID as before. This maintains backward compatibility.
/* If reading GID fails, it is likely due to GID entry being
* empty (invalid) or reserved GID in the table. User space
* expects to read GID table entries as long as the given index
* is within GID table size. Administrative/debugging tool
* fails to query rest of the GID entries if it hits error
* while querying a GID of the given index. To avoid user
* space throwing such error on fail to read gid, return zero
* GID as before. This maintains backward compatibility.
*/
if (ret)
pgid = &zgid;
else
pgid = &gid;
return sprintf(buf, "%pI6\n", pgid->raw);
return sprintf(buf, "%pI6\n", zgid.raw);
}
ret = sprintf(buf, "%pI6\n", gid_attr->gid.raw);
rdma_put_gid_attr(gid_attr);
return ret;
}
static ssize_t show_port_gid_attr_ndev(struct ib_port *p,
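
The sysfs changes above replace "copy the GID out" with "get a referenced attribute, use it, put it", and keep the backward-compatible behaviour of printing a zero GID when the entry cannot be read. A minimal user-space sketch of that shape follows. The table, `get_entry`, `put_entry` and `show_gid` are hypothetical, and the output is plain hex rather than the kernel's `%pI6` format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TABLE_LEN 4

/* Hypothetical reference-counted GID table entry. */
struct gid_entry {
	unsigned char raw[16];
	int valid;
	int refs;
};

static struct gid_entry table[TABLE_LEN];

/* Return a referenced entry, or NULL if the index is out of range or empty. */
static struct gid_entry *get_entry(unsigned int index)
{
	if (index >= TABLE_LEN || !table[index].valid)
		return NULL;
	table[index].refs++;
	return &table[index];
}

static void put_entry(struct gid_entry *e)
{
	assert(e->refs > 0);
	e->refs--;
}

/* Format entry `index` as hex; an unreadable entry is shown as all zeros. */
static int show_gid(unsigned int index, char *buf, size_t len)
{
	static const unsigned char zero[16];
	struct gid_entry *e = get_entry(index);
	const unsigned char *raw = e ? e->raw : zero;
	size_t n = 0;

	/* Each byte needs two hex digits plus room for the terminating NUL. */
	for (int i = 0; i < 16 && n + 3 <= len; i++)
		n += (size_t)snprintf(buf + n, len - n, "%02x", raw[i]);
	if (e)
		put_entry(e); /* every successful get is paired with a put */
	return (int)n;
}
```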


@ -207,7 +207,7 @@ error:
}
static void ib_ucm_event_req_get(struct ib_ucm_req_event_resp *ureq,
struct ib_cm_req_event_param *kreq)
const struct ib_cm_req_event_param *kreq)
{
ureq->remote_ca_guid = kreq->remote_ca_guid;
ureq->remote_qkey = kreq->remote_qkey;
@ -231,7 +231,7 @@ static void ib_ucm_event_req_get(struct ib_ucm_req_event_resp *ureq,
}
static void ib_ucm_event_rep_get(struct ib_ucm_rep_event_resp *urep,
struct ib_cm_rep_event_param *krep)
const struct ib_cm_rep_event_param *krep)
{
urep->remote_ca_guid = krep->remote_ca_guid;
urep->remote_qkey = krep->remote_qkey;
@ -247,14 +247,14 @@ static void ib_ucm_event_rep_get(struct ib_ucm_rep_event_resp *urep,
}
static void ib_ucm_event_sidr_rep_get(struct ib_ucm_sidr_rep_event_resp *urep,
struct ib_cm_sidr_rep_event_param *krep)
const struct ib_cm_sidr_rep_event_param *krep)
{
urep->status = krep->status;
urep->qkey = krep->qkey;
urep->qpn = krep->qpn;
};
static int ib_ucm_event_process(struct ib_cm_event *evt,
static int ib_ucm_event_process(const struct ib_cm_event *evt,
struct ib_ucm_event *uvt)
{
void *info = NULL;
@ -351,7 +351,7 @@ err1:
}
static int ib_ucm_event_handler(struct ib_cm_id *cm_id,
struct ib_cm_event *event)
const struct ib_cm_event *event)
{
struct ib_ucm_event *uevent;
struct ib_ucm_context *ctx;
@ -1000,14 +1000,11 @@ static ssize_t ib_ucm_send_sidr_req(struct ib_ucm_file *file,
const char __user *inbuf,
int in_len, int out_len)
{
struct ib_cm_sidr_req_param param;
struct ib_cm_sidr_req_param param = {};
struct ib_ucm_context *ctx;
struct ib_ucm_sidr_req cmd;
int result;
param.private_data = NULL;
param.path = NULL;
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
return -EFAULT;


@ -84,7 +84,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
struct ib_umem *umem;
struct page **page_list;
struct vm_area_struct **vma_list;
unsigned long locked;
unsigned long lock_limit;
unsigned long cur_base;
unsigned long npages;
@ -92,7 +91,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
int i;
unsigned long dma_attrs = 0;
struct scatterlist *sg, *sg_list_start;
int need_release = 0;
unsigned int gup_flags = FOLL_WRITE;
if (dmasync)
@ -121,10 +119,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
if (access & IB_ACCESS_ON_DEMAND) {
ret = ib_umem_odp_get(context, umem, access);
if (ret) {
kfree(umem);
return ERR_PTR(ret);
}
if (ret)
goto umem_kfree;
return umem;
}
@ -135,8 +131,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
kfree(umem);
return ERR_PTR(-ENOMEM);
ret = -ENOMEM;
goto umem_kfree;
}
/*
@ -149,41 +145,43 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
npages = ib_umem_num_pages(umem);
down_write(&current->mm->mmap_sem);
locked = npages + current->mm->pinned_vm;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) {
down_write(&current->mm->mmap_sem);
current->mm->pinned_vm += npages;
if ((current->mm->pinned_vm > lock_limit) && !capable(CAP_IPC_LOCK)) {
up_write(&current->mm->mmap_sem);
ret = -ENOMEM;
goto out;
goto vma;
}
up_write(&current->mm->mmap_sem);
cur_base = addr & PAGE_MASK;
if (npages == 0 || npages > UINT_MAX) {
ret = -EINVAL;
goto out;
goto vma;
}
ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL);
if (ret)
goto out;
goto vma;
if (!umem->writable)
gup_flags |= FOLL_FORCE;
need_release = 1;
sg_list_start = umem->sg_head.sgl;
down_read(&current->mm->mmap_sem);
while (npages) {
ret = get_user_pages_longterm(cur_base,
min_t(unsigned long, npages,
PAGE_SIZE / sizeof (struct page *)),
gup_flags, page_list, vma_list);
if (ret < 0)
goto out;
if (ret < 0) {
up_read(&current->mm->mmap_sem);
goto umem_release;
}
umem->npages += ret;
cur_base += ret * PAGE_SIZE;
@ -199,6 +197,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
/* preparing for next loop */
sg_list_start = sg;
}
up_read(&current->mm->mmap_sem);
umem->nmap = ib_dma_map_sg_attrs(context->device,
umem->sg_head.sgl,
@ -206,27 +205,28 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
DMA_BIDIRECTIONAL,
dma_attrs);
if (umem->nmap <= 0) {
if (!umem->nmap) {
ret = -ENOMEM;
goto out;
goto umem_release;
}
ret = 0;
goto out;
out:
if (ret < 0) {
if (need_release)
umem_release:
__ib_umem_release(context->device, umem, 0);
kfree(umem);
} else
current->mm->pinned_vm = locked;
vma:
down_write(&current->mm->mmap_sem);
current->mm->pinned_vm -= ib_umem_num_pages(umem);
up_write(&current->mm->mmap_sem);
out:
if (vma_list)
free_page((unsigned long) vma_list);
free_page((unsigned long) page_list);
return ret < 0 ? ERR_PTR(ret) : umem;
umem_kfree:
if (ret)
kfree(umem);
return ret ? ERR_PTR(ret) : umem;
}
EXPORT_SYMBOL(ib_umem_get);
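
The reworked `ib_umem_get()` charges the pinned-page count up front and then unwinds through a ladder of labels, each undoing only what had already succeeded. The sketch below shows that pattern in single-threaded user-space C; it omits the locking the kernel code performs around the counter. `pinned_pages`, `pin_limit`, `struct region`, `region_get`, `region_put` and the `fail_alloc` test hook are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-process accounting, standing in for a pinned-page count. */
static unsigned long pinned_pages;
static const unsigned long pin_limit = 8;

struct region {
	unsigned long npages;
	void **pages;
};

/*
 * Charge npages first, then build the region. On failure each label undoes
 * exactly the steps that had succeeded before the jump.
 * fail_alloc is a test hook that simulates an allocation failure.
 */
static struct region *region_get(unsigned long npages, int fail_alloc)
{
	struct region *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->npages = npages;
	r->pages = NULL;

	if (npages == 0)
		goto free_region;      /* nothing charged yet */

	pinned_pages += npages;        /* charge first ... */
	if (pinned_pages > pin_limit)
		goto uncharge;         /* ... and roll back if over the limit */

	if (!fail_alloc)
		r->pages = calloc(npages, sizeof(*r->pages));
	if (!r->pages)
		goto uncharge;

	return r;

uncharge:
	pinned_pages -= r->npages;
free_region:
	free(r);
	return NULL;
}

/* Release a region and return its pages to the accounting counter. */
static void region_put(struct region *r)
{
	assert(pinned_pages >= r->npages);
	pinned_pages -= r->npages;
	free(r->pages);
	free(r);
}
```

Charging before the limit check means a concurrent caller can never observe an undercounted total; the cost is that every failure path after the charge must pass through the uncharge label.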


@ -268,6 +268,7 @@ static void recv_handler(struct ib_mad_agent *agent,
packet->mad.hdr.traffic_class = grh->traffic_class;
memcpy(packet->mad.hdr.gid, &grh->dgid, 16);
packet->mad.hdr.flow_label = cpu_to_be32(grh->flow_label);
rdma_destroy_ah_attr(&ah_attr);
}
if (queue_packet(file, agent, packet))


@ -111,7 +111,7 @@ struct ib_uverbs_device {
struct mutex lists_mutex; /* protect lists */
struct list_head uverbs_file_list;
struct list_head uverbs_events_file_list;
struct uverbs_root_spec *specs_root;
struct uverbs_api *uapi;
};
struct ib_uverbs_event_queue {
@ -130,21 +130,37 @@ struct ib_uverbs_async_event_file {
};
struct ib_uverbs_completion_event_file {
struct ib_uobject_file uobj_file;
struct ib_uobject uobj;
struct ib_uverbs_event_queue ev_queue;
};
struct ib_uverbs_file {
struct kref ref;
struct mutex mutex;
struct mutex cleanup_mutex; /* protect cleanup */
struct ib_uverbs_device *device;
struct mutex ucontext_lock;
/*
* ucontext must be accessed via ib_uverbs_get_ucontext() or with
* ucontext_lock held
*/
struct ib_ucontext *ucontext;
struct ib_event_handler event_handler;
struct ib_uverbs_async_event_file *async_file;
struct list_head list;
int is_closed;
/*
* To access the uobjects list hw_destroy_rwsem must be held for write
* OR hw_destroy_rwsem held for read AND uobjects_lock held.
* hw_destroy_rwsem should be held across any destruction of the HW
* object of an associated uobject.
*/
struct rw_semaphore hw_destroy_rwsem;
spinlock_t uobjects_lock;
struct list_head uobjects;
u64 uverbs_cmd_mask;
u64 uverbs_ex_cmd_mask;
struct idr idr;
/* spinlock protects write access to idr */
spinlock_t idr_lock;
@ -196,7 +212,6 @@ struct ib_uwq_object {
struct ib_ucq_object {
struct ib_uobject uobject;
struct ib_uverbs_file *uverbs_file;
struct list_head comp_list;
struct list_head async_list;
u32 comp_events_reported;
@ -230,7 +245,7 @@ void ib_uverbs_wq_event_handler(struct ib_event *event, void *context_ptr);
void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr);
void ib_uverbs_event_handler(struct ib_event_handler *handler,
struct ib_event *event);
int ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd,
int ib_uverbs_dealloc_xrcd(struct ib_uobject *uobject, struct ib_xrcd *xrcd,
enum rdma_remove_reason why);
int uverbs_dealloc_mw(struct ib_mw *mw);
@@ -238,12 +253,7 @@ void ib_uverbs_detach_umcast(struct ib_qp *qp,
struct ib_uqp_object *uobj);
void create_udata(struct uverbs_attr_bundle *ctx, struct ib_udata *udata);
extern const struct uverbs_attr_def uverbs_uhw_compat_in;
extern const struct uverbs_attr_def uverbs_uhw_compat_out;
long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
int uverbs_destroy_def_handler(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs);
struct ib_uverbs_flow_spec {
union {
@@ -292,7 +302,6 @@ extern const struct uverbs_object_def UVERBS_OBJECT(UVERBS_OBJECT_COUNTERS);
#define IB_UVERBS_DECLARE_CMD(name) \
ssize_t ib_uverbs_##name(struct ib_uverbs_file *file, \
struct ib_device *ib_dev, \
const char __user *buf, int in_len, \
int out_len)
@@ -334,7 +343,6 @@ IB_UVERBS_DECLARE_CMD(close_xrcd);
#define IB_UVERBS_DECLARE_EX_CMD(name) \
int ib_uverbs_ex_##name(struct ib_uverbs_file *file, \
struct ib_device *ib_dev, \
struct ib_udata *ucore, \
struct ib_udata *uhw)

File diff suppressed because it is too large


@@ -35,6 +35,103 @@
#include "rdma_core.h"
#include "uverbs.h"
struct bundle_alloc_head {
struct bundle_alloc_head *next;
u8 data[];
};
struct bundle_priv {
/* Must be first */
struct bundle_alloc_head alloc_head;
struct bundle_alloc_head *allocated_mem;
size_t internal_avail;
size_t internal_used;
struct radix_tree_root *radix;
const struct uverbs_api_ioctl_method *method_elm;
void __rcu **radix_slots;
unsigned long radix_slots_len;
u32 method_key;
struct ib_uverbs_attr __user *user_attrs;
struct ib_uverbs_attr *uattrs;
DECLARE_BITMAP(uobj_finalize, UVERBS_API_ATTR_BKEY_LEN);
/*
* Must be last. bundle ends in a flex array which overlaps
* internal_buffer.
*/
struct uverbs_attr_bundle bundle;
u64 internal_buffer[32];
};
/*
* Each method has an absolute minimum amount of memory it needs to allocate,
* precompute that amount and determine if the onstack memory can be used or
* if allocation is needed.
*/
void uapi_compute_bundle_size(struct uverbs_api_ioctl_method *method_elm,
unsigned int num_attrs)
{
struct bundle_priv *pbundle;
size_t bundle_size =
offsetof(struct bundle_priv, internal_buffer) +
sizeof(*pbundle->bundle.attrs) * method_elm->key_bitmap_len +
sizeof(*pbundle->uattrs) * num_attrs;
method_elm->use_stack = bundle_size <= sizeof(*pbundle);
method_elm->bundle_size =
ALIGN(bundle_size + 256, sizeof(*pbundle->internal_buffer));
/* Do not want order-2 allocations for this. */
WARN_ON_ONCE(method_elm->bundle_size > PAGE_SIZE);
}
/**
* uverbs_alloc() - Quickly allocate memory for use with a bundle
* @bundle: The bundle
* @size: Number of bytes to allocate
* @flags: Allocator flags
*
* The bundle allocator is intended for allocations that are connected with
* processing the system call related to the bundle. The allocated memory is
* always freed once the system call completes, and cannot be freed any other
* way.
*
* This tries to use a small pool of pre-allocated memory for performance.
*/
__malloc void *_uverbs_alloc(struct uverbs_attr_bundle *bundle, size_t size,
gfp_t flags)
{
struct bundle_priv *pbundle =
container_of(bundle, struct bundle_priv, bundle);
size_t new_used;
void *res;
if (check_add_overflow(size, pbundle->internal_used, &new_used))
return ERR_PTR(-EOVERFLOW);
if (new_used > pbundle->internal_avail) {
struct bundle_alloc_head *buf;
buf = kvmalloc(struct_size(buf, data, size), flags);
if (!buf)
return ERR_PTR(-ENOMEM);
buf->next = pbundle->allocated_mem;
pbundle->allocated_mem = buf;
return buf->data;
}
res = (void *)pbundle->internal_buffer + pbundle->internal_used;
pbundle->internal_used =
ALIGN(new_used, sizeof(*pbundle->internal_buffer));
if (flags & __GFP_ZERO)
memset(res, 0, size);
return res;
}
EXPORT_SYMBOL(_uverbs_alloc);
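The allocator above carves requests out of a small inline buffer and chains a separate allocation when that buffer is exhausted. The following is a minimal userspace sketch of that idea, not the kernel implementation: `struct arena`, `arena_alloc()` and `arena_destroy()` are hypothetical names, `malloc()` stands in for `kvmalloc()`, and failures are reported as NULL rather than `ERR_PTR()` values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical analogue of bundle_alloc_head: a singly linked heap chunk. */
struct chunk {
	struct chunk *next;
	unsigned char data[];
};

/* Hypothetical analogue of the allocator state kept in bundle_priv. */
struct arena {
	struct chunk *heap_chunks;	/* fallback allocations, freed together */
	size_t used;			/* bytes consumed from inline_buf */
	uint64_t inline_buf[32];	/* small pre-allocated pool */
};

static void arena_init(struct arena *a)
{
	a->heap_chunks = NULL;
	a->used = 0;
}

/* Returns NULL on overflow or out-of-memory. */
static void *arena_alloc(struct arena *a, size_t size)
{
	const size_t align = sizeof(a->inline_buf[0]);
	size_t new_used;
	void *res;

	assert(a->used <= sizeof(a->inline_buf));

	if (size > SIZE_MAX - a->used)
		return NULL;	/* size + used would overflow */
	new_used = a->used + size;

	if (new_used > sizeof(a->inline_buf)) {
		/* Inline pool exhausted: chain a separate heap chunk. */
		struct chunk *c;

		if (size > SIZE_MAX - sizeof(*c))
			return NULL;
		c = malloc(sizeof(*c) + size);
		if (!c)
			return NULL;
		c->next = a->heap_chunks;
		a->heap_chunks = c;
		return c->data;
	}

	/* Fast path: bump the offset, keeping it aligned for the next caller. */
	res = (unsigned char *)a->inline_buf + a->used;
	a->used = (new_used + align - 1) & ~(align - 1);
	return res;
}

/* Frees every heap chunk at once; individual frees are not supported. */
static void arena_destroy(struct arena *a)
{
	struct chunk *c = a->heap_chunks;

	while (c) {
		struct chunk *next = c->next;

		free(c);
		c = next;
	}
	a->heap_chunks = NULL;
	a->used = 0;
}
```

A caller would pair one `arena_init()` with one `arena_destroy()` per request, which mirrors the rule stated in the kernel-doc comment that the memory is released only when the system call completes.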
static bool uverbs_is_attr_cleared(const struct ib_uverbs_attr *uattr,
u16 len)
{
@@ -46,45 +143,24 @@ static bool uverbs_is_attr_cleared(const struct ib_uverbs_attr *uattr,
0, uattr->len - len);
}
static int uverbs_process_attr(struct ib_device *ibdev,
struct ib_ucontext *ucontext,
const struct ib_uverbs_attr *uattr,
u16 attr_id,
const struct uverbs_attr_spec_hash *attr_spec_bucket,
struct uverbs_attr_bundle_hash *attr_bundle_h,
struct ib_uverbs_attr __user *uattr_ptr)
static int uverbs_process_attr(struct bundle_priv *pbundle,
const struct uverbs_api_attr *attr_uapi,
struct ib_uverbs_attr *uattr, u32 attr_bkey)
{
const struct uverbs_attr_spec *spec;
const struct uverbs_attr_spec *val_spec;
struct uverbs_attr *e;
const struct uverbs_object_spec *object;
const struct uverbs_attr_spec *spec = &attr_uapi->spec;
struct uverbs_attr *e = &pbundle->bundle.attrs[attr_bkey];
const struct uverbs_attr_spec *val_spec = spec;
struct uverbs_obj_attr *o_attr;
struct uverbs_attr *elements = attr_bundle_h->attrs;
if (attr_id >= attr_spec_bucket->num_attrs) {
if (uattr->flags & UVERBS_ATTR_F_MANDATORY)
return -EINVAL;
else
return 0;
}
if (test_bit(attr_id, attr_bundle_h->valid_bitmap))
return -EINVAL;
spec = &attr_spec_bucket->attrs[attr_id];
val_spec = spec;
e = &elements[attr_id];
e->uattr = uattr_ptr;
switch (spec->type) {
case UVERBS_ATTR_TYPE_ENUM_IN:
if (uattr->attr_data.enum_data.elem_id >= spec->enum_def.num_elems)
if (uattr->attr_data.enum_data.elem_id >= spec->u.enum_def.num_elems)
return -EOPNOTSUPP;
if (uattr->attr_data.enum_data.reserved)
return -EINVAL;
val_spec = &spec->enum_def.ids[uattr->attr_data.enum_data.elem_id];
val_spec = &spec->u2.enum_def.ids[uattr->attr_data.enum_data.elem_id];
/* Currently we only support PTR_IN based enums */
if (val_spec->type != UVERBS_ATTR_TYPE_PTR_IN)
@@ -98,285 +174,301 @@ static int uverbs_process_attr(struct ib_device *ibdev,
* longer struct will fail here if used with an old kernel and
* non-zero content, making ABI compat/discovery simpler.
*/
if (uattr->len > val_spec->ptr.len &&
val_spec->flags & UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO &&
!uverbs_is_attr_cleared(uattr, val_spec->ptr.len))
if (uattr->len > val_spec->u.ptr.len &&
val_spec->zero_trailing &&
!uverbs_is_attr_cleared(uattr, val_spec->u.ptr.len))
return -EOPNOTSUPP;
/* fall through */
case UVERBS_ATTR_TYPE_PTR_OUT:
if (uattr->len < val_spec->ptr.min_len ||
(!(val_spec->flags & UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO) &&
uattr->len > val_spec->ptr.len))
if (uattr->len < val_spec->u.ptr.min_len ||
(!val_spec->zero_trailing &&
uattr->len > val_spec->u.ptr.len))
return -EINVAL;
if (spec->type != UVERBS_ATTR_TYPE_ENUM_IN &&
uattr->attr_data.reserved)
return -EINVAL;
e->ptr_attr.data = uattr->data;
e->ptr_attr.uattr_idx = uattr - pbundle->uattrs;
e->ptr_attr.len = uattr->len;
e->ptr_attr.flags = uattr->flags;
if (val_spec->alloc_and_copy && !uverbs_attr_ptr_is_inline(e)) {
void *p;
p = uverbs_alloc(&pbundle->bundle, uattr->len);
if (IS_ERR(p))
return PTR_ERR(p);
e->ptr_attr.ptr = p;
if (copy_from_user(p, u64_to_user_ptr(uattr->data),
uattr->len))
return -EFAULT;
} else {
e->ptr_attr.data = uattr->data;
}
break;
case UVERBS_ATTR_TYPE_IDR:
if (uattr->data >> 32)
return -EINVAL;
/* fall through */
case UVERBS_ATTR_TYPE_FD:
if (uattr->attr_data.reserved)
return -EINVAL;
if (uattr->len != 0 || !ucontext || uattr->data > INT_MAX)
if (uattr->len != 0)
return -EINVAL;
o_attr = &e->obj_attr;
object = uverbs_get_object(ibdev, spec->obj.obj_type);
if (!object)
return -EINVAL;
o_attr->type = object->type_attrs;
o_attr->id = (int)uattr->data;
o_attr->uobject = uverbs_get_uobject_from_context(
o_attr->type,
ucontext,
spec->obj.access,
o_attr->id);
o_attr->attr_elm = attr_uapi;
/*
* The type of uattr->data is u64 for UVERBS_ATTR_TYPE_IDR and
* s64 for UVERBS_ATTR_TYPE_FD. We can cast the u64 to s64
* here without caring about truncation as we know that the
* IDR implementation today rejects negative IDs
*/
o_attr->uobject = uverbs_get_uobject_from_file(
spec->u.obj.obj_type,
pbundle->bundle.ufile,
spec->u.obj.access,
uattr->data_s64);
if (IS_ERR(o_attr->uobject))
return PTR_ERR(o_attr->uobject);
__set_bit(attr_bkey, pbundle->uobj_finalize);
if (spec->obj.access == UVERBS_ACCESS_NEW) {
u64 id = o_attr->uobject->id;
if (spec->u.obj.access == UVERBS_ACCESS_NEW) {
unsigned int uattr_idx = uattr - pbundle->uattrs;
s64 id = o_attr->uobject->id;
/* Copy the allocated id to the user-space */
if (put_user(id, &e->uattr->data)) {
uverbs_finalize_object(o_attr->uobject,
UVERBS_ACCESS_NEW,
false);
if (put_user(id, &pbundle->user_attrs[uattr_idx].data))
return -EFAULT;
}
}
break;
default:
return -EOPNOTSUPP;
}
set_bit(attr_id, attr_bundle_h->valid_bitmap);
return 0;
}
static int uverbs_uattrs_process(struct ib_device *ibdev,
struct ib_ucontext *ucontext,
const struct ib_uverbs_attr *uattrs,
size_t num_uattrs,
const struct uverbs_method_spec *method,
struct uverbs_attr_bundle *attr_bundle,
struct ib_uverbs_attr __user *uattr_ptr)
{
size_t i;
int ret = 0;
int num_given_buckets = 0;
for (i = 0; i < num_uattrs; i++) {
const struct ib_uverbs_attr *uattr = &uattrs[i];
u16 attr_id = uattr->attr_id;
struct uverbs_attr_spec_hash *attr_spec_bucket;
ret = uverbs_ns_idx(&attr_id, method->num_buckets);
if (ret < 0) {
if (uattr->flags & UVERBS_ATTR_F_MANDATORY) {
uverbs_finalize_objects(attr_bundle,
method->attr_buckets,
num_given_buckets,
false);
return ret;
}
continue;
}
/*
* ret is the found ns, so increase num_given_buckets if
* necessary.
/*
* We search the radix tree with the method prefix and now we want to fast
* search the suffix bits to get a particular attribute pointer. It is not
* totally clear to me if this breaks the radix tree encapsulation or not, but
* it uses the iter data to determine if the method iter points at the same
* chunk that will store the attribute, if so it just derefs it directly. By
* construction in most kernel configs the method and attrs will all fit in a
* single radix chunk, so in most cases this will have no search. Other cases
* this falls back to a full search.
*/
if (ret >= num_given_buckets)
num_given_buckets = ret + 1;
static void __rcu **uapi_get_attr_for_method(struct bundle_priv *pbundle,
u32 attr_key)
{
void __rcu **slot;
attr_spec_bucket = method->attr_buckets[ret];
ret = uverbs_process_attr(ibdev, ucontext, uattr, attr_id,
attr_spec_bucket, &attr_bundle->hash[ret],
uattr_ptr++);
if (ret) {
uverbs_finalize_objects(attr_bundle,
method->attr_buckets,
num_given_buckets,
false);
return ret;
}
if (likely(attr_key < pbundle->radix_slots_len)) {
void *entry;
slot = pbundle->radix_slots + attr_key;
entry = rcu_dereference_raw(*slot);
if (likely(!radix_tree_is_internal_node(entry) && entry))
return slot;
}
return num_given_buckets;
return radix_tree_lookup_slot(pbundle->radix,
pbundle->method_key | attr_key);
}
static int uverbs_validate_kernel_mandatory(const struct uverbs_method_spec *method_spec,
struct uverbs_attr_bundle *attr_bundle)
{
unsigned int i;
for (i = 0; i < attr_bundle->num_buckets; i++) {
struct uverbs_attr_spec_hash *attr_spec_bucket =
method_spec->attr_buckets[i];
if (!bitmap_subset(attr_spec_bucket->mandatory_attrs_bitmask,
attr_bundle->hash[i].valid_bitmap,
attr_spec_bucket->num_attrs))
return -EINVAL;
}
for (; i < method_spec->num_buckets; i++) {
struct uverbs_attr_spec_hash *attr_spec_bucket =
method_spec->attr_buckets[i];
if (!bitmap_empty(attr_spec_bucket->mandatory_attrs_bitmask,
attr_spec_bucket->num_attrs))
return -EINVAL;
}
return 0;
}
static int uverbs_handle_method(struct ib_uverbs_attr __user *uattr_ptr,
const struct ib_uverbs_attr *uattrs,
size_t num_uattrs,
struct ib_device *ibdev,
struct ib_uverbs_file *ufile,
const struct uverbs_method_spec *method_spec,
struct uverbs_attr_bundle *attr_bundle)
static int uverbs_set_attr(struct bundle_priv *pbundle,
struct ib_uverbs_attr *uattr)
{
u32 attr_key = uapi_key_attr(uattr->attr_id);
u32 attr_bkey = uapi_bkey_attr(attr_key);
const struct uverbs_api_attr *attr;
void __rcu **slot;
int ret;
int finalize_ret;
int num_given_buckets;
num_given_buckets = uverbs_uattrs_process(ibdev, ufile->ucontext, uattrs,
num_uattrs, method_spec,
attr_bundle, uattr_ptr);
if (num_given_buckets <= 0)
slot = uapi_get_attr_for_method(pbundle, attr_key);
if (!slot) {
/*
* Kernel does not support the attribute but user-space says it
* is mandatory
*/
if (uattr->flags & UVERBS_ATTR_F_MANDATORY)
return -EPROTONOSUPPORT;
return 0;
}
attr = srcu_dereference(
*slot, &pbundle->bundle.ufile->device->disassociate_srcu);
/* Reject duplicate attributes from user-space */
if (test_bit(attr_bkey, pbundle->bundle.attr_present))
return -EINVAL;
attr_bundle->num_buckets = num_given_buckets;
ret = uverbs_validate_kernel_mandatory(method_spec, attr_bundle);
ret = uverbs_process_attr(pbundle, attr, uattr, attr_bkey);
if (ret)
goto cleanup;
return ret;
ret = method_spec->handler(ibdev, ufile, attr_bundle);
cleanup:
finalize_ret = uverbs_finalize_objects(attr_bundle,
method_spec->attr_buckets,
attr_bundle->num_buckets,
!ret);
__set_bit(attr_bkey, pbundle->bundle.attr_present);
return ret ? ret : finalize_ret;
return 0;
}
#define UVERBS_OPTIMIZE_USING_STACK_SZ 256
static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct ib_uverbs_ioctl_hdr *hdr,
void __user *buf)
static int ib_uverbs_run_method(struct bundle_priv *pbundle,
unsigned int num_attrs)
{
const struct uverbs_object_spec *object_spec;
const struct uverbs_method_spec *method_spec;
long err = 0;
int (*handler)(struct ib_uverbs_file *ufile,
struct uverbs_attr_bundle *ctx);
size_t uattrs_size = array_size(sizeof(*pbundle->uattrs), num_attrs);
unsigned int destroy_bkey = pbundle->method_elm->destroy_bkey;
unsigned int i;
struct {
struct ib_uverbs_attr *uattrs;
struct uverbs_attr_bundle *uverbs_attr_bundle;
} *ctx = NULL;
struct uverbs_attr *curr_attr;
unsigned long *curr_bitmap;
size_t ctx_size;
uintptr_t data[UVERBS_OPTIMIZE_USING_STACK_SZ / sizeof(uintptr_t)];
int ret;
if (hdr->driver_id != ib_dev->driver_id)
return -EINVAL;
/* See uverbs_disassociate_api() */
handler = srcu_dereference(
pbundle->method_elm->handler,
&pbundle->bundle.ufile->device->disassociate_srcu);
if (!handler)
return -EIO;
object_spec = uverbs_get_object(ib_dev, hdr->object_id);
if (!object_spec)
return -EPROTONOSUPPORT;
pbundle->uattrs = uverbs_alloc(&pbundle->bundle, uattrs_size);
if (IS_ERR(pbundle->uattrs))
return PTR_ERR(pbundle->uattrs);
if (copy_from_user(pbundle->uattrs, pbundle->user_attrs, uattrs_size))
return -EFAULT;
method_spec = uverbs_get_method(object_spec, hdr->method_id);
if (!method_spec)
return -EPROTONOSUPPORT;
if ((method_spec->flags & UVERBS_ACTION_FLAG_CREATE_ROOT) ^ !file->ucontext)
return -EINVAL;
ctx_size = sizeof(*ctx) +
sizeof(struct uverbs_attr_bundle) +
sizeof(struct uverbs_attr_bundle_hash) * method_spec->num_buckets +
sizeof(*ctx->uattrs) * hdr->num_attrs +
sizeof(*ctx->uverbs_attr_bundle->hash[0].attrs) *
method_spec->num_child_attrs +
sizeof(*ctx->uverbs_attr_bundle->hash[0].valid_bitmap) *
(method_spec->num_child_attrs / BITS_PER_LONG +
method_spec->num_buckets);
if (ctx_size <= UVERBS_OPTIMIZE_USING_STACK_SZ)
ctx = (void *)data;
if (!ctx)
ctx = kmalloc(ctx_size, GFP_KERNEL);
if (!ctx)
return -ENOMEM;
ctx->uverbs_attr_bundle = (void *)ctx + sizeof(*ctx);
ctx->uattrs = (void *)(ctx->uverbs_attr_bundle + 1) +
(sizeof(ctx->uverbs_attr_bundle->hash[0]) *
method_spec->num_buckets);
curr_attr = (void *)(ctx->uattrs + hdr->num_attrs);
curr_bitmap = (void *)(curr_attr + method_spec->num_child_attrs);
/*
* We just fill the pointers and num_attrs here. The data itself will be
* filled at a later stage (uverbs_process_attr)
*/
for (i = 0; i < method_spec->num_buckets; i++) {
unsigned int curr_num_attrs = method_spec->attr_buckets[i]->num_attrs;
ctx->uverbs_attr_bundle->hash[i].attrs = curr_attr;
curr_attr += curr_num_attrs;
ctx->uverbs_attr_bundle->hash[i].num_attrs = curr_num_attrs;
ctx->uverbs_attr_bundle->hash[i].valid_bitmap = curr_bitmap;
bitmap_zero(curr_bitmap, curr_num_attrs);
curr_bitmap += BITS_TO_LONGS(curr_num_attrs);
for (i = 0; i != num_attrs; i++) {
ret = uverbs_set_attr(pbundle, &pbundle->uattrs[i]);
if (unlikely(ret))
return ret;
}
err = copy_from_user(ctx->uattrs, buf,
sizeof(*ctx->uattrs) * hdr->num_attrs);
if (err) {
err = -EFAULT;
goto out;
}
/* User space did not provide all the mandatory attributes */
if (unlikely(!bitmap_subset(pbundle->method_elm->attr_mandatory,
pbundle->bundle.attr_present,
pbundle->method_elm->key_bitmap_len)))
return -EINVAL;
err = uverbs_handle_method(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
file, method_spec, ctx->uverbs_attr_bundle);
if (destroy_bkey != UVERBS_API_ATTR_BKEY_LEN) {
struct uverbs_obj_attr *destroy_attr =
&pbundle->bundle.attrs[destroy_bkey].obj_attr;
ret = uobj_destroy(destroy_attr->uobject);
if (ret)
return ret;
__clear_bit(destroy_bkey, pbundle->uobj_finalize);
ret = handler(pbundle->bundle.ufile, &pbundle->bundle);
uobj_put_destroy(destroy_attr->uobject);
} else {
ret = handler(pbundle->bundle.ufile, &pbundle->bundle);
}
/*
* EPROTONOSUPPORT is ONLY to be returned if the ioctl framework can
* not invoke the method because the request is not supported. No
* other cases should return this code.
*/
if (unlikely(err == -EPROTONOSUPPORT)) {
WARN_ON_ONCE(err == -EPROTONOSUPPORT);
err = -EINVAL;
}
out:
if (ctx != (void *)data)
kfree(ctx);
return err;
if (WARN_ON_ONCE(ret == -EPROTONOSUPPORT))
return -EINVAL;
return ret;
}
#define IB_UVERBS_MAX_CMD_SZ 4096
static int bundle_destroy(struct bundle_priv *pbundle, bool commit)
{
unsigned int key_bitmap_len = pbundle->method_elm->key_bitmap_len;
struct bundle_alloc_head *memblock;
unsigned int i;
int ret = 0;
i = -1;
while ((i = find_next_bit(pbundle->uobj_finalize, key_bitmap_len,
i + 1)) < key_bitmap_len) {
struct uverbs_attr *attr = &pbundle->bundle.attrs[i];
int current_ret;
current_ret = uverbs_finalize_object(
attr->obj_attr.uobject,
attr->obj_attr.attr_elm->spec.u.obj.access, commit);
if (!ret)
ret = current_ret;
}
for (memblock = pbundle->allocated_mem; memblock;) {
struct bundle_alloc_head *tmp = memblock;
memblock = memblock->next;
kvfree(tmp);
}
return ret;
}
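The cleanup loop in bundle_destroy() visits every bit set in a bitmap, finalizes the matching object, and keeps going after a failure while remembering only the first error. A small userspace sketch of that pattern follows. All names here (`next_set_bit()`, `finalize_all()`, `demo_finalize()`) are hypothetical stand-ins, and the callback replaces uverbs_finalize_object().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Returns the index of the first set bit at or after start, or nbits. */
static size_t next_set_bit(const unsigned long *bitmap, size_t nbits,
			   size_t start)
{
	for (size_t i = start; i < nbits; i++)
		if (bitmap[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	return nbits;
}

typedef int (*finalize_fn)(size_t idx, int commit, void *ctx);

/*
 * Run fn for every set bit. A failure does not stop the walk, so every
 * object is still released; only the first error is reported.
 */
static int finalize_all(const unsigned long *bitmap, size_t nbits, int commit,
			finalize_fn fn, void *ctx)
{
	int ret = 0;

	assert(fn != NULL);
	for (size_t i = next_set_bit(bitmap, nbits, 0); i < nbits;
	     i = next_set_bit(bitmap, nbits, i + 1)) {
		int cur = fn(i, commit, ctx);

		if (!ret)
			ret = cur;
	}
	return ret;
}

/* Demo callback (illustrative only): counts visits, fails on odd indices. */
struct demo_log {
	size_t visited;
};

static int demo_finalize(size_t idx, int commit, void *ctx)
{
	struct demo_log *log = ctx;

	(void)commit;
	log->visited++;
	return (idx % 2) ? -(int)idx : 0;
}
```

Continuing past the first failure matters because each set bit represents a held reference or lock; stopping early would leak the rest.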
static int ib_uverbs_cmd_verbs(struct ib_uverbs_file *ufile,
struct ib_uverbs_ioctl_hdr *hdr,
struct ib_uverbs_attr __user *user_attrs)
{
const struct uverbs_api_ioctl_method *method_elm;
struct uverbs_api *uapi = ufile->device->uapi;
struct radix_tree_iter attrs_iter;
struct bundle_priv *pbundle;
struct bundle_priv onstack;
void __rcu **slot;
int destroy_ret;
int ret;
if (unlikely(hdr->driver_id != uapi->driver_id))
return -EINVAL;
slot = radix_tree_iter_lookup(
&uapi->radix, &attrs_iter,
uapi_key_obj(hdr->object_id) |
uapi_key_ioctl_method(hdr->method_id));
if (unlikely(!slot))
return -EPROTONOSUPPORT;
method_elm = srcu_dereference(*slot, &ufile->device->disassociate_srcu);
if (!method_elm->use_stack) {
pbundle = kmalloc(method_elm->bundle_size, GFP_KERNEL);
if (!pbundle)
return -ENOMEM;
pbundle->internal_avail =
method_elm->bundle_size -
offsetof(struct bundle_priv, internal_buffer);
pbundle->alloc_head.next = NULL;
pbundle->allocated_mem = &pbundle->alloc_head;
} else {
pbundle = &onstack;
pbundle->internal_avail = sizeof(pbundle->internal_buffer);
pbundle->allocated_mem = NULL;
}
/* Space for the pbundle->bundle.attrs flex array */
pbundle->method_elm = method_elm;
pbundle->method_key = attrs_iter.index;
pbundle->bundle.ufile = ufile;
pbundle->radix = &uapi->radix;
pbundle->radix_slots = slot;
pbundle->radix_slots_len = radix_tree_chunk_size(&attrs_iter);
pbundle->user_attrs = user_attrs;
pbundle->internal_used = ALIGN(pbundle->method_elm->key_bitmap_len *
sizeof(*pbundle->bundle.attrs),
sizeof(*pbundle->internal_buffer));
memset(pbundle->bundle.attr_present, 0,
sizeof(pbundle->bundle.attr_present));
memset(pbundle->uobj_finalize, 0, sizeof(pbundle->uobj_finalize));
ret = ib_uverbs_run_method(pbundle, hdr->num_attrs);
destroy_ret = bundle_destroy(pbundle, ret == 0);
if (unlikely(destroy_ret && !ret))
return destroy_ret;
return ret;
}
long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
@@ -384,39 +476,138 @@ long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
struct ib_uverbs_ioctl_hdr __user *user_hdr =
(struct ib_uverbs_ioctl_hdr __user *)arg;
struct ib_uverbs_ioctl_hdr hdr;
struct ib_device *ib_dev;
int srcu_key;
long err;
int err;
if (unlikely(cmd != RDMA_VERBS_IOCTL))
return -ENOIOCTLCMD;
err = copy_from_user(&hdr, user_hdr, sizeof(hdr));
if (err)
return -EFAULT;
if (hdr.length > PAGE_SIZE ||
hdr.length != struct_size(&hdr, attrs, hdr.num_attrs))
return -EINVAL;
if (hdr.reserved1 || hdr.reserved2)
return -EPROTONOSUPPORT;
srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
ib_dev = srcu_dereference(file->device->ib_dev,
&file->device->disassociate_srcu);
if (!ib_dev) {
err = -EIO;
goto out;
}
if (cmd == RDMA_VERBS_IOCTL) {
err = copy_from_user(&hdr, user_hdr, sizeof(hdr));
if (err || hdr.length > IB_UVERBS_MAX_CMD_SZ ||
hdr.length != sizeof(hdr) + hdr.num_attrs * sizeof(struct ib_uverbs_attr)) {
err = -EINVAL;
goto out;
}
if (hdr.reserved1 || hdr.reserved2) {
err = -EPROTONOSUPPORT;
goto out;
}
err = ib_uverbs_cmd_verbs(ib_dev, file, &hdr,
(__user void *)arg + sizeof(hdr));
} else {
err = -ENOIOCTLCMD;
}
out:
err = ib_uverbs_cmd_verbs(file, &hdr, user_hdr->attrs);
srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
return err;
}
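The new ib_uverbs_ioctl() rejects a request unless the declared length exactly matches the header plus `num_attrs` attribute records, and unless the reserved fields are zero. The sketch below reproduces only that validation in userspace. `struct demo_hdr` and `struct demo_attr` are simplified, hypothetical layouts rather than the real uapi structures, and `DEMO_MAX_CMD` stands in for PAGE_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CMD 4096u	/* stand-in for PAGE_SIZE */

/* Simplified, hypothetical attribute record. */
struct demo_attr {
	uint16_t attr_id;
	uint16_t len;
	uint16_t flags;
	uint16_t reserved;
	uint64_t data;
};

/* Simplified, hypothetical ioctl header ending in a flexible array. */
struct demo_hdr {
	uint16_t length;
	uint16_t object_id;
	uint16_t method_id;
	uint16_t num_attrs;
	uint64_t reserved1;
	uint32_t driver_id;
	uint32_t reserved2;
	struct demo_attr attrs[];
};

/* Returns 0 if the header is self-consistent, a negative errno otherwise. */
static int demo_check_hdr(const struct demo_hdr *hdr)
{
	size_t expect;

	assert(hdr != NULL);

	/* num_attrs is 16 bits wide, so this product cannot overflow size_t. */
	expect = sizeof(*hdr) + (size_t)hdr->num_attrs * sizeof(struct demo_attr);

	if (hdr->length > DEMO_MAX_CMD || hdr->length != expect)
		return -EINVAL;

	/* Non-zero reserved fields signal a request this code cannot honor. */
	if (hdr->reserved1 || hdr->reserved2)
		return -EPROTONOSUPPORT;

	return 0;
}
```

Requiring an exact length match, rather than a minimum, means a caller cannot smuggle trailing bytes that the kernel would silently ignore.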
int uverbs_get_flags64(u64 *to, const struct uverbs_attr_bundle *attrs_bundle,
size_t idx, u64 allowed_bits)
{
const struct uverbs_attr *attr;
u64 flags;
attr = uverbs_attr_get(attrs_bundle, idx);
/* Missing attribute means 0 flags */
if (IS_ERR(attr)) {
*to = 0;
return 0;
}
/*
* New userspace code should use 8 bytes to pass flags, but we
* transparently support old userspaces that were using 4 bytes as
* well.
*/
if (attr->ptr_attr.len == 8)
flags = attr->ptr_attr.data;
else if (attr->ptr_attr.len == 4)
flags = *(u32 *)&attr->ptr_attr.data;
else
return -EINVAL;
if (flags & ~allowed_bits)
return -EINVAL;
*to = flags;
return 0;
}
EXPORT_SYMBOL(uverbs_get_flags64);
int uverbs_get_flags32(u32 *to, const struct uverbs_attr_bundle *attrs_bundle,
size_t idx, u64 allowed_bits)
{
u64 flags;
int ret;
ret = uverbs_get_flags64(&flags, attrs_bundle, idx, allowed_bits);
if (ret)
return ret;
if (flags > U32_MAX)
return -EINVAL;
*to = flags;
return 0;
}
EXPORT_SYMBOL(uverbs_get_flags32);
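The two flag helpers accept either a 4-byte or an 8-byte payload, treat a missing attribute as zero flags, and reject any bit outside `allowed_bits`. A userspace sketch of the same checks is shown here. `struct flags_attr`, `get_flags64()` and `get_flags32()` are hypothetical names, and a NULL pointer stands in for the kernel's `IS_ERR(attr)` test. Like the kernel code, the 4-byte case reads the first four bytes of the stored value in memory order.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an inline attribute: payload lives in data. */
struct flags_attr {
	uint64_t data;
	uint16_t len;	/* payload size in bytes: 4 or 8 */
};

/* attr == NULL models a missing attribute, which means "no flags". */
static int get_flags64(uint64_t *to, const struct flags_attr *attr,
		       uint64_t allowed_bits)
{
	uint64_t flags;

	assert(to != NULL);

	if (!attr) {
		*to = 0;
		return 0;
	}

	if (attr->len == 8) {
		flags = attr->data;
	} else if (attr->len == 4) {
		uint32_t v;

		/* Older callers stored only 4 bytes at the start of data. */
		memcpy(&v, &attr->data, sizeof(v));
		flags = v;
	} else {
		return -EINVAL;
	}

	if (flags & ~allowed_bits)
		return -EINVAL;

	*to = flags;
	return 0;
}

/* Same checks, plus a range check so the result fits in 32 bits. */
static int get_flags32(uint32_t *to, const struct flags_attr *attr,
		       uint64_t allowed_bits)
{
	uint64_t flags;
	int ret;

	assert(to != NULL);

	ret = get_flags64(&flags, attr, allowed_bits);
	if (ret)
		return ret;
	if (flags > UINT32_MAX)
		return -EINVAL;

	*to = (uint32_t)flags;
	return 0;
}
```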
/*
* This is for ease of conversion. The purpose is to convert all drivers to
* use uverbs_attr_bundle instead of ib_udata. Assume attr == 0 is input and
* attr == 1 is output.
*/
void create_udata(struct uverbs_attr_bundle *bundle, struct ib_udata *udata)
{
struct bundle_priv *pbundle =
container_of(bundle, struct bundle_priv, bundle);
const struct uverbs_attr *uhw_in =
uverbs_attr_get(bundle, UVERBS_ATTR_UHW_IN);
const struct uverbs_attr *uhw_out =
uverbs_attr_get(bundle, UVERBS_ATTR_UHW_OUT);
if (!IS_ERR(uhw_in)) {
udata->inlen = uhw_in->ptr_attr.len;
if (uverbs_attr_ptr_is_inline(uhw_in))
udata->inbuf =
&pbundle->user_attrs[uhw_in->ptr_attr.uattr_idx]
.data;
else
udata->inbuf = u64_to_user_ptr(uhw_in->ptr_attr.data);
} else {
udata->inbuf = NULL;
udata->inlen = 0;
}
if (!IS_ERR(uhw_out)) {
udata->outbuf = u64_to_user_ptr(uhw_out->ptr_attr.data);
udata->outlen = uhw_out->ptr_attr.len;
} else {
udata->outbuf = NULL;
udata->outlen = 0;
}
}
int uverbs_copy_to(const struct uverbs_attr_bundle *bundle, size_t idx,
const void *from, size_t size)
{
struct bundle_priv *pbundle =
container_of(bundle, struct bundle_priv, bundle);
const struct uverbs_attr *attr = uverbs_attr_get(bundle, idx);
u16 flags;
size_t min_size;
if (IS_ERR(attr))
return PTR_ERR(attr);
min_size = min_t(size_t, attr->ptr_attr.len, size);
if (copy_to_user(u64_to_user_ptr(attr->ptr_attr.data), from, min_size))
return -EFAULT;
flags = pbundle->uattrs[attr->ptr_attr.uattr_idx].flags |
UVERBS_ATTR_F_VALID_OUTPUT;
if (put_user(flags,
&pbundle->user_attrs[attr->ptr_attr.uattr_idx].flags))
return -EFAULT;
return 0;
}
EXPORT_SYMBOL(uverbs_copy_to);


@@ -1,664 +0,0 @@
/*
* Copyright (c) 2017, Mellanox Technologies inc. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#include <rdma/uverbs_ioctl.h>
#include <rdma/rdma_user_ioctl.h>
#include <linux/bitops.h>
#include "uverbs.h"
#define UVERBS_NUM_NS (UVERBS_ID_NS_MASK >> UVERBS_ID_NS_SHIFT)
#define GET_NS_ID(idx) (((idx) & UVERBS_ID_NS_MASK) >> UVERBS_ID_NS_SHIFT)
#define GET_ID(idx) ((idx) & ~UVERBS_ID_NS_MASK)
#define _for_each_element(elem, tmpi, tmpj, hashes, num_buckets_offset, \
buckets_offset) \
for (tmpj = 0, \
elem = (*(const void ***)((hashes)[tmpi] + \
(buckets_offset)))[0]; \
tmpj < *(size_t *)((hashes)[tmpi] + (num_buckets_offset)); \
tmpj++) \
if ((elem = ((*(const void ***)(hashes[tmpi] + \
(buckets_offset)))[tmpj])))
/*
* Iterate all elements of a few @hashes. The number of given hashes is
* indicated by @num_hashes. The offset of the number of buckets in the hash is
* represented by @num_buckets_offset, while the offset of the buckets array in
* the hash structure is represented by @buckets_offset. tmpi and tmpj are two
* short (or int) based indices that are given by the user. tmpi iterates over
* the different hashes. @elem points the current element in the hashes[tmpi]
* bucket we are looping on. To be honest, @hashes representation isn't exactly
* a hash, but more a collection of elements. These elements' ids are treated
* in a hash like manner, where the first upper bits are the bucket number.
* These elements are later mapped into a perfect-hash.
*/
#define for_each_element(elem, tmpi, tmpj, hashes, num_hashes, \
num_buckets_offset, buckets_offset) \
for (tmpi = 0; tmpi < (num_hashes); tmpi++) \
_for_each_element(elem, tmpi, tmpj, hashes, num_buckets_offset,\
buckets_offset)
#define get_elements_iterators_entry_above(iters, num_elements, elements, \
num_objects_fld, objects_fld, bucket,\
min_id) \
get_elements_above_id((const void **)iters, num_elements, \
(const void **)(elements), \
offsetof(typeof(**elements), \
num_objects_fld), \
offsetof(typeof(**elements), objects_fld),\
offsetof(typeof(***(*elements)->objects_fld), id),\
bucket, min_id)
#define get_objects_above_id(iters, num_trees, trees, bucket, min_id) \
get_elements_iterators_entry_above(iters, num_trees, trees, \
num_objects, objects, bucket, min_id)
#define get_methods_above_id(method_iters, num_iters, iters, bucket, min_id)\
get_elements_iterators_entry_above(method_iters, num_iters, iters, \
num_methods, methods, bucket, min_id)
#define get_attrs_above_id(attrs_iters, num_iters, iters, bucket, min_id)\
get_elements_iterators_entry_above(attrs_iters, num_iters, iters, \
num_attrs, attrs, bucket, min_id)
/*
* get_elements_above_id get a few hashes represented by @elements and
* @num_elements. The hashes fields are described by @num_offset, @data_offset
* and @id_offset in the same way as required by for_each_element. The function
* returns an array of @iters, represents an array of elements in the hashes
* buckets, which their ids are the smallest ids in all hashes but are all
* larger than the id given by min_id. Elements are only added to the iters
* array if their id belongs to the bucket @bucket. The number of elements in
* the returned array is returned by the function. @min_id is also updated to
* reflect the new min_id of all elements in iters.
*/
static size_t get_elements_above_id(const void **iters,
unsigned int num_elements,
const void **elements,
size_t num_offset,
size_t data_offset,
size_t id_offset,
u16 bucket,
short *min_id)
{
size_t num_iters = 0;
short min = SHRT_MAX;
const void *elem;
int i, j, last_stored = -1;
unsigned int equal_min = 0;
for_each_element(elem, i, j, elements, num_elements, num_offset,
data_offset) {
u16 id = *(u16 *)(elem + id_offset);
if (GET_NS_ID(id) != bucket)
continue;
if (GET_ID(id) < *min_id ||
(min != SHRT_MAX && GET_ID(id) > min))
continue;
/*
* We first iterate all hashes represented by @elements. When
* we do, we try to find an element @elem in the bucket @bucket
* which its id is min. Since we can't ensure the user sorted
* the elements in increasing order, we override this hash's
* minimal id element we found, if a new element with a smaller
* id was just found.
*/
iters[last_stored == i ? num_iters - 1 : num_iters++] = elem;
last_stored = i;
if (min == GET_ID(id))
equal_min++;
else
equal_min = 1;
min = GET_ID(id);
}
/*
* We only insert to our iters array an element, if its id is smaller
* than all previous ids. Therefore, the final iters array is sorted so
* that smaller ids are in the end of the array.
* Therefore, we need to clean the beginning of the array to make sure
* all ids of final elements are equal to min.
*/
memmove(iters, iters + num_iters - equal_min, sizeof(*iters) * equal_min);
*min_id = min;
return equal_min;
}
#define find_max_element_entry_id(num_elements, elements, num_objects_fld, \
objects_fld, bucket) \
find_max_element_id(num_elements, (const void **)(elements), \
offsetof(typeof(**elements), num_objects_fld), \
offsetof(typeof(**elements), objects_fld), \
offsetof(typeof(***(*elements)->objects_fld), id),\
bucket)
static short find_max_element_ns_id(unsigned int num_elements,
const void **elements,
size_t num_offset,
size_t data_offset,
size_t id_offset)
{
short max_ns = SHRT_MIN;
const void *elem;
int i, j;
for_each_element(elem, i, j, elements, num_elements, num_offset,
data_offset) {
u16 id = *(u16 *)(elem + id_offset);
if (GET_NS_ID(id) > max_ns)
max_ns = GET_NS_ID(id);
}
return max_ns;
}
static short find_max_element_id(unsigned int num_elements,
const void **elements,
size_t num_offset,
size_t data_offset,
size_t id_offset,
u16 bucket)
{
short max_id = SHRT_MIN;
const void *elem;
int i, j;
for_each_element(elem, i, j, elements, num_elements, num_offset,
data_offset) {
u16 id = *(u16 *)(elem + id_offset);
if (GET_NS_ID(id) == bucket &&
GET_ID(id) > max_id)
max_id = GET_ID(id);
}
return max_id;
}
#define find_max_element_entry_id(num_elements, elements, num_objects_fld, \
objects_fld, bucket) \
find_max_element_id(num_elements, (const void **)(elements), \
offsetof(typeof(**elements), num_objects_fld), \
offsetof(typeof(**elements), objects_fld), \
offsetof(typeof(***(*elements)->objects_fld), id),\
bucket)
#define find_max_element_ns_entry_id(num_elements, elements, \
num_objects_fld, objects_fld) \
find_max_element_ns_id(num_elements, (const void **)(elements), \
offsetof(typeof(**elements), num_objects_fld),\
offsetof(typeof(**elements), objects_fld), \
offsetof(typeof(***(*elements)->objects_fld), id))
/*
* find_max_xxxx_ns_id takes a set of elements. Each element is described by an
* id whose upper bits represent a namespace. It finds the max namespace, which
* tells us how many buckets we need to allocate. If no elements exist,
* SHRT_MIN is returned. A namespace here corresponds to a bucket; the common
* example is "common bucket" and "driver bucket".
*
* find_max_xxxx_id takes a set of elements and a bucket. Each element is
* described by an id whose upper bits represent a namespace. It returns the
* max id contained in the namespace given by @bucket, which tells us how many
* elements we need to allocate in that bucket. If the bucket holds no
* elements, SHRT_MIN is returned.
*/
#define find_max_object_id(num_trees, trees, bucket) \
find_max_element_entry_id(num_trees, trees, num_objects,\
objects, bucket)
#define find_max_object_ns_id(num_trees, trees) \
find_max_element_ns_entry_id(num_trees, trees, \
num_objects, objects)
#define find_max_method_id(num_iters, iters, bucket) \
find_max_element_entry_id(num_iters, iters, num_methods,\
methods, bucket)
#define find_max_method_ns_id(num_iters, iters) \
find_max_element_ns_entry_id(num_iters, iters, \
num_methods, methods)
#define find_max_attr_id(num_iters, iters, bucket) \
find_max_element_entry_id(num_iters, iters, num_attrs, \
attrs, bucket)
#define find_max_attr_ns_id(num_iters, iters) \
find_max_element_ns_entry_id(num_iters, iters, \
num_attrs, attrs)
static void free_method(struct uverbs_method_spec *method)
{
unsigned int i;
if (!method)
return;
for (i = 0; i < method->num_buckets; i++)
kfree(method->attr_buckets[i]);
kfree(method);
}
#define IS_ATTR_OBJECT(attr) ((attr)->type == UVERBS_ATTR_TYPE_IDR || \
(attr)->type == UVERBS_ATTR_TYPE_FD)
/*
* This function gets an array of size @num_method_defs which contains pointers
* to method definitions @method_defs. The function allocates a
* uverbs_method_spec structure and initializes its number of buckets and the
* elements in buckets to the correct attributes. While doing that, it
* validates that there are no conflicts between attributes of different
* method_defs.
*/
static struct uverbs_method_spec *build_method_with_attrs(const struct uverbs_method_def **method_defs,
size_t num_method_defs)
{
int bucket_idx;
int max_attr_buckets = 0;
size_t num_attr_buckets = 0;
int res = 0;
struct uverbs_method_spec *method = NULL;
const struct uverbs_attr_def **attr_defs;
unsigned int num_of_singularities = 0;
max_attr_buckets = find_max_attr_ns_id(num_method_defs, method_defs);
if (max_attr_buckets >= 0)
num_attr_buckets = max_attr_buckets + 1;
method = kzalloc(struct_size(method, attr_buckets, num_attr_buckets),
GFP_KERNEL);
if (!method)
return ERR_PTR(-ENOMEM);
method->num_buckets = num_attr_buckets;
attr_defs = kcalloc(num_method_defs, sizeof(*attr_defs), GFP_KERNEL);
if (!attr_defs) {
res = -ENOMEM;
goto free_method;
}
for (bucket_idx = 0; bucket_idx < method->num_buckets; bucket_idx++) {
short min_id = SHRT_MIN;
int attr_max_bucket = 0;
struct uverbs_attr_spec_hash *hash = NULL;
attr_max_bucket = find_max_attr_id(num_method_defs, method_defs,
bucket_idx);
if (attr_max_bucket < 0)
continue;
hash = kzalloc(sizeof(*hash) +
ALIGN(sizeof(*hash->attrs) * (attr_max_bucket + 1),
sizeof(long)) +
BITS_TO_LONGS(attr_max_bucket + 1) * sizeof(long),
GFP_KERNEL);
if (!hash) {
res = -ENOMEM;
goto free;
}
hash->num_attrs = attr_max_bucket + 1;
method->num_child_attrs += hash->num_attrs;
hash->mandatory_attrs_bitmask = (void *)(hash + 1) +
ALIGN(sizeof(*hash->attrs) *
(attr_max_bucket + 1),
sizeof(long));
method->attr_buckets[bucket_idx] = hash;
do {
size_t num_attr_defs;
struct uverbs_attr_spec *attr;
bool attr_obj_with_special_access;
num_attr_defs =
get_attrs_above_id(attr_defs,
num_method_defs,
method_defs,
bucket_idx,
&min_id);
/* Last attr in bucket */
if (!num_attr_defs)
break;
if (num_attr_defs > 1) {
/*
* We don't allow two attribute definitions for
* the same attribute. This is usually a
* programmer error. If required, it's better to
* just add a new attribute to capture the new
* semantics.
*/
res = -EEXIST;
goto free;
}
attr = &hash->attrs[min_id];
memcpy(attr, &attr_defs[0]->attr, sizeof(*attr));
attr_obj_with_special_access = IS_ATTR_OBJECT(attr) &&
(attr->obj.access == UVERBS_ACCESS_NEW ||
attr->obj.access == UVERBS_ACCESS_DESTROY);
num_of_singularities += !!attr_obj_with_special_access;
if (WARN(num_of_singularities > 1,
"ib_uverbs: Method contains more than one object attr (%d) with new/destroy access\n",
min_id) ||
WARN(attr_obj_with_special_access &&
!(attr->flags & UVERBS_ATTR_SPEC_F_MANDATORY),
"ib_uverbs: Tried to merge attr (%d) but it's an object with new/destroy access but isn't mandatory\n",
min_id) ||
WARN(IS_ATTR_OBJECT(attr) &&
attr->flags & UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO,
"ib_uverbs: Tried to merge attr (%d) but it's an object with min_sz flag\n",
min_id)) {
res = -EINVAL;
goto free;
}
if (attr->flags & UVERBS_ATTR_SPEC_F_MANDATORY)
set_bit(min_id, hash->mandatory_attrs_bitmask);
min_id++;
} while (1);
}
kfree(attr_defs);
return method;
free:
kfree(attr_defs);
free_method:
free_method(method);
return ERR_PTR(res);
}
static void free_object(struct uverbs_object_spec *object)
{
unsigned int i, j;
if (!object)
return;
for (i = 0; i < object->num_buckets; i++) {
struct uverbs_method_spec_hash *method_buckets =
object->method_buckets[i];
if (!method_buckets)
continue;
for (j = 0; j < method_buckets->num_methods; j++)
free_method(method_buckets->methods[j]);
kfree(method_buckets);
}
kfree(object);
}
/*
* This function gets an array of size @num_object_defs which contains pointers
* to object definitions @object_defs. The function allocates a
* uverbs_object_spec structure and initializes its number of buckets and the
* elements in buckets to the correct methods. While doing that, it
* resolves conflicts between definitions of the same method.
*/
static struct uverbs_object_spec *build_object_with_methods(const struct uverbs_object_def **object_defs,
size_t num_object_defs)
{
u16 bucket_idx;
int max_method_buckets = 0;
u16 num_method_buckets = 0;
int res = 0;
struct uverbs_object_spec *object = NULL;
const struct uverbs_method_def **method_defs;
max_method_buckets = find_max_method_ns_id(num_object_defs, object_defs);
if (max_method_buckets >= 0)
num_method_buckets = max_method_buckets + 1;
object = kzalloc(struct_size(object, method_buckets,
num_method_buckets),
GFP_KERNEL);
if (!object)
return ERR_PTR(-ENOMEM);
object->num_buckets = num_method_buckets;
method_defs = kcalloc(num_object_defs, sizeof(*method_defs), GFP_KERNEL);
if (!method_defs) {
res = -ENOMEM;
goto free_object;
}
for (bucket_idx = 0; bucket_idx < object->num_buckets; bucket_idx++) {
short min_id = SHRT_MIN;
int methods_max_bucket = 0;
struct uverbs_method_spec_hash *hash = NULL;
methods_max_bucket = find_max_method_id(num_object_defs, object_defs,
bucket_idx);
if (methods_max_bucket < 0)
continue;
hash = kzalloc(struct_size(hash, methods,
methods_max_bucket + 1),
GFP_KERNEL);
if (!hash) {
res = -ENOMEM;
goto free;
}
hash->num_methods = methods_max_bucket + 1;
object->method_buckets[bucket_idx] = hash;
do {
size_t num_method_defs;
struct uverbs_method_spec *method;
int i;
num_method_defs =
get_methods_above_id(method_defs,
num_object_defs,
object_defs,
bucket_idx,
&min_id);
/* Last method in bucket */
if (!num_method_defs)
break;
method = build_method_with_attrs(method_defs,
num_method_defs);
if (IS_ERR(method)) {
res = PTR_ERR(method);
goto free;
}
/*
* The last tree which is given as an argument to the
* merge overrides previous method handler.
* Therefore, we iterate backwards and search for the
* first handler which != NULL. This also defines the
* set of flags used for this handler.
*/
for (i = num_method_defs - 1;
i >= 0 && !method_defs[i]->handler; i--)
;
hash->methods[min_id++] = method;
/* NULL handler isn't allowed */
if (WARN(i < 0,
"ib_uverbs: tried to merge function id %d, but all handlers are NULL\n",
min_id)) {
res = -EINVAL;
goto free;
}
method->handler = method_defs[i]->handler;
method->flags = method_defs[i]->flags;
} while (1);
}
kfree(method_defs);
return object;
free:
kfree(method_defs);
free_object:
free_object(object);
return ERR_PTR(res);
}
void uverbs_free_spec_tree(struct uverbs_root_spec *root)
{
unsigned int i, j;
if (!root)
return;
for (i = 0; i < root->num_buckets; i++) {
struct uverbs_object_spec_hash *object_hash =
root->object_buckets[i];
if (!object_hash)
continue;
for (j = 0; j < object_hash->num_objects; j++)
free_object(object_hash->objects[j]);
kfree(object_hash);
}
kfree(root);
}
EXPORT_SYMBOL(uverbs_free_spec_tree);
struct uverbs_root_spec *uverbs_alloc_spec_tree(unsigned int num_trees,
const struct uverbs_object_tree_def **trees)
{
u16 bucket_idx;
short max_object_buckets = 0;
size_t num_objects_buckets = 0;
struct uverbs_root_spec *root_spec = NULL;
const struct uverbs_object_def **object_defs;
int i;
int res = 0;
max_object_buckets = find_max_object_ns_id(num_trees, trees);
/*
* Devices which don't want to support ib_uverbs should just allocate
* an empty parsing tree. No user-space command will hit a valid
* entry in the parsing tree, and thus every command will fail.
*/
if (max_object_buckets >= 0)
num_objects_buckets = max_object_buckets + 1;
root_spec = kzalloc(struct_size(root_spec, object_buckets,
num_objects_buckets),
GFP_KERNEL);
if (!root_spec)
return ERR_PTR(-ENOMEM);
root_spec->num_buckets = num_objects_buckets;
object_defs = kcalloc(num_trees, sizeof(*object_defs),
GFP_KERNEL);
if (!object_defs) {
res = -ENOMEM;
goto free_root;
}
for (bucket_idx = 0; bucket_idx < root_spec->num_buckets; bucket_idx++) {
short min_id = SHRT_MIN;
short objects_max_bucket;
struct uverbs_object_spec_hash *hash = NULL;
objects_max_bucket = find_max_object_id(num_trees, trees,
bucket_idx);
if (objects_max_bucket < 0)
continue;
hash = kzalloc(struct_size(hash, objects,
objects_max_bucket + 1),
GFP_KERNEL);
if (!hash) {
res = -ENOMEM;
goto free;
}
hash->num_objects = objects_max_bucket + 1;
root_spec->object_buckets[bucket_idx] = hash;
do {
size_t num_object_defs;
struct uverbs_object_spec *object;
num_object_defs = get_objects_above_id(object_defs,
num_trees,
trees,
bucket_idx,
&min_id);
/* Last object in bucket */
if (!num_object_defs)
break;
object = build_object_with_methods(object_defs,
num_object_defs);
if (IS_ERR(object)) {
res = PTR_ERR(object);
goto free;
}
/*
* The last tree which is given as an argument to the
* merge overrides previous object's type_attrs.
* Therefore, we iterate backwards and search for the
* first type_attrs which != NULL.
*/
for (i = num_object_defs - 1;
i >= 0 && !object_defs[i]->type_attrs; i--)
;
/*
* NULL is a valid type_attrs. It means an object we
* can't instantiate (like DEVICE).
*/
object->type_attrs = i < 0 ? NULL :
object_defs[i]->type_attrs;
hash->objects[min_id++] = object;
} while (1);
}
kfree(object_defs);
return root_spec;
free:
kfree(object_defs);
free_root:
uverbs_free_spec_tree(root_spec);
return ERR_PTR(res);
}
EXPORT_SYMBOL(uverbs_alloc_spec_tree);
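
The comment above find_max_object_id describes ids whose upper bits select a namespace (bucket) and whose remaining bits index into that bucket. The following is a minimal userspace sketch of that idea, not the kernel code. It assumes a 4-bit namespace and a 12-bit index; the real GET_NS_ID/GET_ID definitions are not shown in this diff, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: top 4 bits = namespace (bucket), low 12 bits = index. */
#define SKETCH_NS_SHIFT 12
#define SKETCH_NS_MASK  0xF000u
#define SKETCH_ID_MASK  0x0FFFu

static unsigned int sketch_ns_id(uint16_t id)
{
	return (id & SKETCH_NS_MASK) >> SKETCH_NS_SHIFT;
}

static unsigned int sketch_id(uint16_t id)
{
	return id & SKETCH_ID_MASK;
}

/* Highest namespace among the ids, or SHRT_MIN when the list is empty. */
static short sketch_max_ns(const uint16_t *ids, size_t n)
{
	short max_ns = SHRT_MIN;

	for (size_t i = 0; i < n; i++)
		if ((short)sketch_ns_id(ids[i]) > max_ns)
			max_ns = (short)sketch_ns_id(ids[i]);
	return max_ns;
}

/* Highest index inside one bucket, or SHRT_MIN when the bucket is unused. */
static short sketch_max_id(const uint16_t *ids, size_t n, unsigned int bucket)
{
	short max_id = SHRT_MIN;

	assert(bucket <= 0xF); /* only 16 namespaces fit in the assumed layout */
	for (size_t i = 0; i < n; i++)
		if (sketch_ns_id(ids[i]) == bucket &&
		    (short)sketch_id(ids[i]) > max_id)
			max_id = (short)sketch_id(ids[i]);
	return max_id;
}
```

As in the builders above, a caller would allocate sketch_max_ns() + 1 buckets and, for each bucket with a non-negative sketch_max_id(), that value + 1 slots.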

@ -41,8 +41,6 @@
#include <linux/fs.h>
#include <linux/poll.h>
#include <linux/sched.h>
#include <linux/sched/mm.h>
#include <linux/sched/task.h>
#include <linux/file.h>
#include <linux/cdev.h>
#include <linux/anon_inodes.h>
@ -77,7 +75,6 @@ static struct class *uverbs_class;
static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
struct ib_device *ib_dev,
const char __user *buf, int in_len,
int out_len) = {
[IB_USER_VERBS_CMD_GET_CONTEXT] = ib_uverbs_get_context,
@ -118,7 +115,6 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
};
static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
struct ib_device *ib_dev,
struct ib_udata *ucore,
struct ib_udata *uhw) = {
[IB_USER_VERBS_EX_CMD_CREATE_FLOW] = ib_uverbs_ex_create_flow,
@ -138,6 +134,30 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
static void ib_uverbs_add_one(struct ib_device *device);
static void ib_uverbs_remove_one(struct ib_device *device, void *client_data);
/*
* Must be called with the ufile->device->disassociate_srcu held, and the lock
* must be held until use of the ucontext is finished.
*/
struct ib_ucontext *ib_uverbs_get_ucontext(struct ib_uverbs_file *ufile)
{
/*
* We do not hold the hw_destroy_rwsem lock for this flow, instead
* srcu is used. It does not matter if someone races this with
* get_context, we get NULL or valid ucontext.
*/
struct ib_ucontext *ucontext = smp_load_acquire(&ufile->ucontext);
if (!srcu_dereference(ufile->device->ib_dev,
&ufile->device->disassociate_srcu))
return ERR_PTR(-EIO);
if (!ucontext)
return ERR_PTR(-EINVAL);
return ucontext;
}
EXPORT_SYMBOL(ib_uverbs_get_ucontext);
int uverbs_dealloc_mw(struct ib_mw *mw)
{
struct ib_pd *pd = mw->pd;
@ -154,6 +174,7 @@ static void ib_uverbs_release_dev(struct kobject *kobj)
struct ib_uverbs_device *dev =
container_of(kobj, struct ib_uverbs_device, kobj);
uverbs_destroy_api(dev->uapi);
cleanup_srcu_struct(&dev->disassociate_srcu);
kfree(dev);
}
@ -184,7 +205,7 @@ void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
}
spin_unlock_irq(&ev_file->ev_queue.lock);
uverbs_uobject_put(&ev_file->uobj_file.uobj);
uverbs_uobject_put(&ev_file->uobj);
}
spin_lock_irq(&file->async_file->ev_queue.lock);
@ -220,20 +241,6 @@ void ib_uverbs_detach_umcast(struct ib_qp *qp,
}
}
static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
struct ib_ucontext *context,
bool device_removed)
{
context->closing = 1;
uverbs_cleanup_ucontext(context, device_removed);
put_pid(context->tgid);
ib_rdmacg_uncharge(&context->cg_obj, context->device,
RDMACG_RESOURCE_HCA_HANDLE);
return context->device->dealloc_ucontext(context);
}
static void ib_uverbs_comp_dev(struct ib_uverbs_device *dev)
{
complete(&dev->comp);
@ -246,6 +253,8 @@ void ib_uverbs_release_file(struct kref *ref)
struct ib_device *ib_dev;
int srcu_key;
release_ufile_idr_uobject(file);
srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
ib_dev = srcu_dereference(file->device->ib_dev,
&file->device->disassociate_srcu);
@ -338,7 +347,7 @@ static ssize_t ib_uverbs_comp_event_read(struct file *filp, char __user *buf,
filp->private_data;
return ib_uverbs_event_read(&comp_ev_file->ev_queue,
comp_ev_file->uobj_file.ufile, filp,
comp_ev_file->uobj.ufile, filp,
buf, count, pos,
sizeof(struct ib_uverbs_comp_event_desc));
}
@ -420,7 +429,9 @@ static int ib_uverbs_async_event_close(struct inode *inode, struct file *filp)
static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
{
struct ib_uverbs_completion_event_file *file = filp->private_data;
struct ib_uobject *uobj = filp->private_data;
struct ib_uverbs_completion_event_file *file = container_of(
uobj, struct ib_uverbs_completion_event_file, uobj);
struct ib_uverbs_event *entry, *tmp;
spin_lock_irq(&file->ev_queue.lock);
@ -528,7 +539,7 @@ void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr)
struct ib_ucq_object *uobj = container_of(event->element.cq->uobject,
struct ib_ucq_object, uobject);
ib_uverbs_async_handler(uobj->uverbs_file, uobj->uobject.user_handle,
ib_uverbs_async_handler(uobj->uobject.ufile, uobj->uobject.user_handle,
event->event, &uobj->async_list,
&uobj->async_events_reported);
}
@ -637,13 +648,13 @@ err_put_refs:
return filp;
}
static bool verify_command_mask(struct ib_device *ib_dev,
u32 command, bool extended)
static bool verify_command_mask(struct ib_uverbs_file *ufile, u32 command,
bool extended)
{
if (!extended)
return ib_dev->uverbs_cmd_mask & BIT_ULL(command);
return ufile->uverbs_cmd_mask & BIT_ULL(command);
return ib_dev->uverbs_ex_cmd_mask & BIT_ULL(command);
return ufile->uverbs_ex_cmd_mask & BIT_ULL(command);
}
static bool verify_command_idx(u32 command, bool extended)
@ -713,7 +724,6 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
{
struct ib_uverbs_file *file = filp->private_data;
struct ib_uverbs_ex_cmd_hdr ex_hdr;
struct ib_device *ib_dev;
struct ib_uverbs_cmd_hdr hdr;
bool extended;
int srcu_key;
@ -748,24 +758,8 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
return ret;
srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
ib_dev = srcu_dereference(file->device->ib_dev,
&file->device->disassociate_srcu);
if (!ib_dev) {
ret = -EIO;
goto out;
}
/*
* Must be after the ib_dev check, as once the RCU clears ib_dev ==
* NULL means ucontext == NULL
*/
if (!file->ucontext &&
(command != IB_USER_VERBS_CMD_GET_CONTEXT || extended)) {
ret = -EINVAL;
goto out;
}
if (!verify_command_mask(ib_dev, command, extended)) {
if (!verify_command_mask(file, command, extended)) {
ret = -EOPNOTSUPP;
goto out;
}
@ -773,7 +767,7 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
buf += sizeof(hdr);
if (!extended) {
ret = uverbs_cmd_table[command](file, ib_dev, buf,
ret = uverbs_cmd_table[command](file, buf,
hdr.in_words * 4,
hdr.out_words * 4);
} else {
@ -792,7 +786,7 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
ex_hdr.provider_in_words * 8,
ex_hdr.provider_out_words * 8);
ret = uverbs_ex_cmd_table[command](file, ib_dev, &ucore, &uhw);
ret = uverbs_ex_cmd_table[command](file, &ucore, &uhw);
ret = (ret) ? : count;
}
@ -804,22 +798,18 @@ out:
static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
{
struct ib_uverbs_file *file = filp->private_data;
struct ib_device *ib_dev;
struct ib_ucontext *ucontext;
int ret = 0;
int srcu_key;
srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
ib_dev = srcu_dereference(file->device->ib_dev,
&file->device->disassociate_srcu);
if (!ib_dev) {
ret = -EIO;
ucontext = ib_uverbs_get_ucontext(file);
if (IS_ERR(ucontext)) {
ret = PTR_ERR(ucontext);
goto out;
}
if (!file->ucontext)
ret = -ENODEV;
else
ret = ib_dev->mmap(file->ucontext, vma);
ret = ucontext->device->mmap(ucontext, vma);
out:
srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
return ret;
@ -879,13 +869,12 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
}
file->device = dev;
spin_lock_init(&file->idr_lock);
idr_init(&file->idr);
file->ucontext = NULL;
file->async_file = NULL;
kref_init(&file->ref);
mutex_init(&file->mutex);
mutex_init(&file->cleanup_mutex);
mutex_init(&file->ucontext_lock);
spin_lock_init(&file->uobjects_lock);
INIT_LIST_HEAD(&file->uobjects);
init_rwsem(&file->hw_destroy_rwsem);
filp->private_data = file;
kobject_get(&dev->kobj);
@ -893,6 +882,11 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
mutex_unlock(&dev->lists_mutex);
srcu_read_unlock(&dev->disassociate_srcu, srcu_key);
file->uverbs_cmd_mask = ib_dev->uverbs_cmd_mask;
file->uverbs_ex_cmd_mask = ib_dev->uverbs_ex_cmd_mask;
setup_ufile_idr_uobject(file);
return nonseekable_open(inode, filp);
err_module:
@ -911,13 +905,7 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
{
struct ib_uverbs_file *file = filp->private_data;
mutex_lock(&file->cleanup_mutex);
if (file->ucontext) {
ib_uverbs_cleanup_ucontext(file, file->ucontext, false);
file->ucontext = NULL;
}
mutex_unlock(&file->cleanup_mutex);
idr_destroy(&file->idr);
uverbs_destroy_ufile_hw(file, RDMA_REMOVE_CLOSE);
mutex_lock(&file->device->lists_mutex);
if (!file->is_closed) {
@ -1006,6 +994,19 @@ static DEVICE_ATTR(abi_version, S_IRUGO, show_dev_abi_version, NULL);
static CLASS_ATTR_STRING(abi_version, S_IRUGO,
__stringify(IB_USER_VERBS_ABI_VERSION));
static int ib_uverbs_create_uapi(struct ib_device *device,
struct ib_uverbs_device *uverbs_dev)
{
struct uverbs_api *uapi;
uapi = uverbs_alloc_api(device->driver_specs, device->driver_id);
if (IS_ERR(uapi))
return PTR_ERR(uapi);
uverbs_dev->uapi = uapi;
return 0;
}
static void ib_uverbs_add_one(struct ib_device *device)
{
int devnum;
@ -1048,6 +1049,9 @@ static void ib_uverbs_add_one(struct ib_device *device)
rcu_assign_pointer(uverbs_dev->ib_dev, device);
uverbs_dev->num_comp_vectors = device->num_comp_vectors;
if (ib_uverbs_create_uapi(device, uverbs_dev))
goto err;
cdev_init(&uverbs_dev->cdev, NULL);
uverbs_dev->cdev.owner = THIS_MODULE;
uverbs_dev->cdev.ops = device->mmap ? &uverbs_mmap_fops : &uverbs_fops;
@ -1067,18 +1071,6 @@ static void ib_uverbs_add_one(struct ib_device *device)
if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
goto err_class;
if (!device->specs_root) {
const struct uverbs_object_tree_def *default_root[] = {
uverbs_default_get_objects()};
uverbs_dev->specs_root = uverbs_alloc_spec_tree(1,
default_root);
if (IS_ERR(uverbs_dev->specs_root))
goto err_class;
device->specs_root = uverbs_dev->specs_root;
}
ib_set_client_data(device, &uverbs_client, uverbs_dev);
return;
@ -1098,44 +1090,6 @@ err:
return;
}
static void ib_uverbs_disassociate_ucontext(struct ib_ucontext *ibcontext)
{
struct ib_device *ib_dev = ibcontext->device;
struct task_struct *owning_process = NULL;
struct mm_struct *owning_mm = NULL;
owning_process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
if (!owning_process)
return;
owning_mm = get_task_mm(owning_process);
if (!owning_mm) {
pr_info("no mm, disassociate ucontext is pending task termination\n");
while (1) {
put_task_struct(owning_process);
usleep_range(1000, 2000);
owning_process = get_pid_task(ibcontext->tgid,
PIDTYPE_PID);
if (!owning_process ||
owning_process->state == TASK_DEAD) {
pr_info("disassociate ucontext done, task was terminated\n");
/* in case task was dead need to release the
* task struct.
*/
if (owning_process)
put_task_struct(owning_process);
return;
}
}
}
down_write(&owning_mm->mmap_sem);
ib_dev->disassociate_ucontext(ibcontext);
up_write(&owning_mm->mmap_sem);
mmput(owning_mm);
put_task_struct(owning_process);
}
static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
struct ib_device *ib_dev)
{
@ -1144,46 +1098,31 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
struct ib_event event;
/* Pending running commands to terminate */
synchronize_srcu(&uverbs_dev->disassociate_srcu);
uverbs_disassociate_api_pre(uverbs_dev);
event.event = IB_EVENT_DEVICE_FATAL;
event.element.port_num = 0;
event.device = ib_dev;
mutex_lock(&uverbs_dev->lists_mutex);
while (!list_empty(&uverbs_dev->uverbs_file_list)) {
struct ib_ucontext *ucontext;
file = list_first_entry(&uverbs_dev->uverbs_file_list,
struct ib_uverbs_file, list);
file->is_closed = 1;
list_del(&file->list);
kref_get(&file->ref);
/* We must release the mutex before going ahead and calling
* uverbs_cleanup_ufile, as it might end up indirectly calling
* uverbs_close, for example due to freeing the resources (e.g
* mmput).
*/
mutex_unlock(&uverbs_dev->lists_mutex);
mutex_lock(&file->cleanup_mutex);
ucontext = file->ucontext;
file->ucontext = NULL;
mutex_unlock(&file->cleanup_mutex);
/* At this point ib_uverbs_close cannot be running
* ib_uverbs_cleanup_ucontext
*/
if (ucontext) {
/* We must release the mutex before going ahead and
* calling disassociate_ucontext. disassociate_ucontext
* might end up indirectly calling uverbs_close,
* for example due to freeing the resources
* (e.g mmput).
*/
ib_uverbs_event_handler(&file->event_handler, &event);
ib_uverbs_disassociate_ucontext(ucontext);
mutex_lock(&file->cleanup_mutex);
ib_uverbs_cleanup_ucontext(file, ucontext, true);
mutex_unlock(&file->cleanup_mutex);
}
uverbs_destroy_ufile_hw(file, RDMA_REMOVE_DRIVER_REMOVE);
kref_put(&file->ref, ib_uverbs_release_file);
mutex_lock(&uverbs_dev->lists_mutex);
kref_put(&file->ref, ib_uverbs_release_file);
}
while (!list_empty(&uverbs_dev->uverbs_events_file_list)) {
@ -1205,6 +1144,8 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
kill_fasync(&event_file->ev_queue.async_queue, SIGIO, POLL_IN);
}
mutex_unlock(&uverbs_dev->lists_mutex);
uverbs_disassociate_api(uverbs_dev->uapi);
}
static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
@ -1232,7 +1173,6 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
* cdev was deleted, however active clients can still issue
* commands and close their open files.
*/
rcu_assign_pointer(uverbs_dev->ib_dev, NULL);
ib_uverbs_free_hw_resources(uverbs_dev, device);
wait_clients = 0;
}
@ -1241,10 +1181,6 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
ib_uverbs_comp_dev(uverbs_dev);
if (wait_clients)
wait_for_completion(&uverbs_dev->comp);
if (uverbs_dev->specs_root) {
uverbs_free_spec_tree(uverbs_dev->specs_root);
device->specs_root = NULL;
}
kobject_put(&uverbs_dev->kobj);
}
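
ib_uverbs_get_ucontext in this file reports failure through the returned pointer: -EIO when the device is gone and -EINVAL when no ucontext exists yet, and ib_uverbs_mmap decodes it with IS_ERR/PTR_ERR. The block below is a rough userspace model of that calling convention only. The sketch_* helpers are hypothetical stand-ins for the kernel's ERR_PTR family, and the SRCU and acquire-load ordering of the real function is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Errno values are encoded in the top 4095 addresses, as the kernel does. */
#define SKETCH_MAX_ERRNO 4095

static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct sketch_ucontext {
	int id;
};

/* Hypothetical stand-in for the per-file state consulted by the lookup. */
struct sketch_ufile {
	int device_present;               /* models a non-NULL ib_dev */
	struct sketch_ucontext *ucontext; /* NULL until a context is created */
};

/* Same check order as the function in the diff: device first, then context. */
static struct sketch_ucontext *sketch_get_ucontext(struct sketch_ufile *f)
{
	assert(f != NULL);
	if (!f->device_present)
		return sketch_err_ptr(-EIO);
	if (!f->ucontext)
		return sketch_err_ptr(-EINVAL);
	return f->ucontext;
}
```

A caller tests the result with sketch_is_err() and, on failure, forwards sketch_ptr_err() as its own return value, which is the shape ib_uverbs_mmap takes after this change.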

@ -211,7 +211,5 @@ void ib_copy_path_rec_from_user(struct sa_path_rec *dst,
/* TODO: No need to set this */
sa_path_set_dmac_zero(dst);
sa_path_set_ndev(dst, NULL);
sa_path_set_ifindex(dst, 0);
}
EXPORT_SYMBOL(ib_copy_path_rec_from_user);

@ -48,14 +48,18 @@ static int uverbs_free_ah(struct ib_uobject *uobject,
static int uverbs_free_flow(struct ib_uobject *uobject,
enum rdma_remove_reason why)
{
int ret;
struct ib_flow *flow = (struct ib_flow *)uobject->object;
struct ib_uflow_object *uflow =
container_of(uobject, struct ib_uflow_object, uobject);
struct ib_qp *qp = flow->qp;
int ret;
ret = ib_destroy_flow(flow);
if (!ret)
ret = flow->device->destroy_flow(flow);
if (!ret) {
if (qp)
atomic_dec(&qp->usecnt);
ib_uverbs_flow_resources_free(uflow->resources);
}
return ret;
}
@ -74,6 +78,13 @@ static int uverbs_free_qp(struct ib_uobject *uobject,
container_of(uobject, struct ib_uqp_object, uevent.uobject);
int ret;
/*
* If this is a user triggered destroy then do not allow destruction
* until the user cleans up all the mcast bindings. Unlike in other
* places we forcibly clean up the mcast attachments for !DESTROY
* because the mcast attaches are not ubojects and will not be
* destroyed by anything else during cleanup processing.
*/
if (why == RDMA_REMOVE_DESTROY) {
if (!list_empty(&uqp->mcast_list))
return -EBUSY;
@ -82,7 +93,7 @@ static int uverbs_free_qp(struct ib_uobject *uobject,
}
ret = ib_destroy_qp(qp);
if (ret && why == RDMA_REMOVE_DESTROY)
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
if (uqp->uxrcd)
@ -100,7 +111,9 @@ static int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
int ret;
ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
if (!ret || why != RDMA_REMOVE_DESTROY)
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
kfree(ind_tbl);
return ret;
}
@ -114,7 +127,9 @@ static int uverbs_free_wq(struct ib_uobject *uobject,
int ret;
ret = ib_destroy_wq(wq);
if (!ret || why != RDMA_REMOVE_DESTROY)
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
ib_uverbs_release_uevent(uobject->context->ufile, &uwq->uevent);
return ret;
}
@ -129,8 +144,7 @@ static int uverbs_free_srq(struct ib_uobject *uobject,
int ret;
ret = ib_destroy_srq(srq);
if (ret && why == RDMA_REMOVE_DESTROY)
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
if (srq_type == IB_SRQT_XRC) {
@ -152,12 +166,12 @@ static int uverbs_free_xrcd(struct ib_uobject *uobject,
container_of(uobject, struct ib_uxrcd_object, uobject);
int ret;
ret = ib_destroy_usecnt(&uxrcd->refcnt, why, uobject);
if (ret)
return ret;
mutex_lock(&uobject->context->ufile->device->xrcd_tree_mutex);
if (why == RDMA_REMOVE_DESTROY && atomic_read(&uxrcd->refcnt))
ret = -EBUSY;
else
ret = ib_uverbs_dealloc_xrcd(uobject->context->ufile->device,
xrcd, why);
ret = ib_uverbs_dealloc_xrcd(uobject, xrcd, why);
mutex_unlock(&uobject->context->ufile->device->xrcd_tree_mutex);
return ret;
@ -167,20 +181,22 @@ static int uverbs_free_pd(struct ib_uobject *uobject,
enum rdma_remove_reason why)
{
struct ib_pd *pd = uobject->object;
int ret;
if (why == RDMA_REMOVE_DESTROY && atomic_read(&pd->usecnt))
return -EBUSY;
ret = ib_destroy_usecnt(&pd->usecnt, why, uobject);
if (ret)
return ret;
ib_dealloc_pd((struct ib_pd *)uobject->object);
return 0;
}
static int uverbs_hot_unplug_completion_event_file(struct ib_uobject_file *uobj_file,
static int uverbs_hot_unplug_completion_event_file(struct ib_uobject *uobj,
enum rdma_remove_reason why)
{
struct ib_uverbs_completion_event_file *comp_event_file =
container_of(uobj_file, struct ib_uverbs_completion_event_file,
uobj_file);
container_of(uobj, struct ib_uverbs_completion_event_file,
uobj);
struct ib_uverbs_event_queue *event_queue = &comp_event_file->ev_queue;
spin_lock_irq(&event_queue->lock);
@ -194,100 +210,59 @@ static int uverbs_hot_unplug_completion_event_file(struct ib_uobject_file *uobj_
return 0;
};
int uverbs_destroy_def_handler(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
int uverbs_destroy_def_handler(struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
{
return 0;
}
EXPORT_SYMBOL(uverbs_destroy_def_handler);
/*
* This spec is used in order to pass information to the hardware driver in a
* legacy way. Every verb that could get driver specific data should get this
* spec.
*/
const struct uverbs_attr_def uverbs_uhw_compat_in =
UVERBS_ATTR_PTR_IN_SZ(UVERBS_ATTR_UHW_IN, UVERBS_ATTR_SIZE(0, USHRT_MAX),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO));
const struct uverbs_attr_def uverbs_uhw_compat_out =
UVERBS_ATTR_PTR_OUT_SZ(UVERBS_ATTR_UHW_OUT, UVERBS_ATTR_SIZE(0, USHRT_MAX),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO));
void create_udata(struct uverbs_attr_bundle *ctx, struct ib_udata *udata)
{
/*
* This is for ease of conversion. The purpose is to convert all drivers
* to use uverbs_attr_bundle instead of ib_udata.
* Assume attr == 0 is input and attr == 1 is output.
*/
const struct uverbs_attr *uhw_in =
uverbs_attr_get(ctx, UVERBS_ATTR_UHW_IN);
const struct uverbs_attr *uhw_out =
uverbs_attr_get(ctx, UVERBS_ATTR_UHW_OUT);
if (!IS_ERR(uhw_in)) {
udata->inlen = uhw_in->ptr_attr.len;
if (uverbs_attr_ptr_is_inline(uhw_in))
udata->inbuf = &uhw_in->uattr->data;
else
udata->inbuf = u64_to_user_ptr(uhw_in->ptr_attr.data);
} else {
udata->inbuf = NULL;
udata->inlen = 0;
}
if (!IS_ERR(uhw_out)) {
udata->outbuf = u64_to_user_ptr(uhw_out->ptr_attr.data);
udata->outlen = uhw_out->ptr_attr.len;
} else {
udata->outbuf = NULL;
udata->outlen = 0;
}
}
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_COMP_CHANNEL,
&UVERBS_TYPE_ALLOC_FD(0,
sizeof(struct ib_uverbs_completion_event_file),
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_COMP_CHANNEL,
UVERBS_TYPE_ALLOC_FD(sizeof(struct ib_uverbs_completion_event_file),
uverbs_hot_unplug_completion_event_file,
&uverbs_event_fops,
"[infinibandevent]", O_RDONLY));
"[infinibandevent]",
O_RDONLY));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_QP,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0,
uverbs_free_qp));
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_QP,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), uverbs_free_qp));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_MW,
&UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_mw));
UVERBS_TYPE_ALLOC_IDR(uverbs_free_mw));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_SRQ,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0,
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_SRQ,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object),
uverbs_free_srq));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_AH,
&UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_ah));
UVERBS_TYPE_ALLOC_IDR(uverbs_free_ah));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_FLOW,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uflow_object),
0, uverbs_free_flow));
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_FLOW,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uflow_object),
uverbs_free_flow));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_WQ,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), 0,
uverbs_free_wq));
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_WQ,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), uverbs_free_wq));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_RWQ_IND_TBL,
&UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_rwq_ind_tbl));
UVERBS_TYPE_ALLOC_IDR(uverbs_free_rwq_ind_tbl));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_XRCD,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object), 0,
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_XRCD,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object),
uverbs_free_xrcd));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_PD,
/* 2 is used in order to free the PD after MRs */
&UVERBS_TYPE_ALLOC_IDR(2, uverbs_free_pd));
UVERBS_TYPE_ALLOC_IDR(uverbs_free_pd));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_DEVICE, NULL);
DECLARE_UVERBS_GLOBAL_METHODS(UVERBS_OBJECT_DEVICE);
static DECLARE_UVERBS_OBJECT_TREE(uverbs_default_objects,
DECLARE_UVERBS_OBJECT_TREE(uverbs_default_objects,
&UVERBS_OBJECT(UVERBS_OBJECT_DEVICE),
&UVERBS_OBJECT(UVERBS_OBJECT_PD),
&UVERBS_OBJECT(UVERBS_OBJECT_MR),
@ -309,4 +284,3 @@ const struct uverbs_object_tree_def *uverbs_default_get_objects(void)
{
return &uverbs_default_objects;
}
EXPORT_SYMBOL_GPL(uverbs_default_get_objects);


@ -38,20 +38,22 @@ static int uverbs_free_counters(struct ib_uobject *uobject,
enum rdma_remove_reason why)
{
struct ib_counters *counters = uobject->object;
int ret;
if (why == RDMA_REMOVE_DESTROY &&
atomic_read(&counters->usecnt))
return -EBUSY;
ret = ib_destroy_usecnt(&counters->usecnt, why, uobject);
if (ret)
return ret;
return counters->device->destroy_counters(counters);
}
static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_CREATE)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_CREATE)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj = uverbs_attr_get_uobject(
attrs, UVERBS_ATTR_CREATE_COUNTERS_HANDLE);
struct ib_device *ib_dev = uobj->context->device;
struct ib_counters *counters;
struct ib_uobject *uobj;
int ret;
/*
@ -62,7 +64,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_CREATE)(struct ib_device *ib_de
if (!ib_dev->create_counters)
return -EOPNOTSUPP;
uobj = uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_COUNTERS_HANDLE);
counters = ib_dev->create_counters(ib_dev, attrs);
if (IS_ERR(counters)) {
ret = PTR_ERR(counters);
@ -80,9 +81,8 @@ err_create_counters:
return ret;
}
static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_READ)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_READ)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_counters_read_attr read_attr = {};
const struct uverbs_attr *uattr;
@ -90,68 +90,62 @@ static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_READ)(struct ib_device *ib_dev,
uverbs_attr_get_obj(attrs, UVERBS_ATTR_READ_COUNTERS_HANDLE);
int ret;
if (!ib_dev->read_counters)
if (!counters->device->read_counters)
return -EOPNOTSUPP;
if (!atomic_read(&counters->usecnt))
return -EINVAL;
ret = uverbs_copy_from(&read_attr.flags, attrs,
UVERBS_ATTR_READ_COUNTERS_FLAGS);
ret = uverbs_get_flags32(&read_attr.flags, attrs,
UVERBS_ATTR_READ_COUNTERS_FLAGS,
IB_UVERBS_READ_COUNTERS_PREFER_CACHED);
if (ret)
return ret;
uattr = uverbs_attr_get(attrs, UVERBS_ATTR_READ_COUNTERS_BUFF);
read_attr.ncounters = uattr->ptr_attr.len / sizeof(u64);
read_attr.counters_buff = kcalloc(read_attr.ncounters,
sizeof(u64), GFP_KERNEL);
if (!read_attr.counters_buff)
return -ENOMEM;
read_attr.counters_buff = uverbs_zalloc(
attrs, array_size(read_attr.ncounters, sizeof(u64)));
if (IS_ERR(read_attr.counters_buff))
return PTR_ERR(read_attr.counters_buff);
ret = ib_dev->read_counters(counters,
&read_attr,
attrs);
ret = counters->device->read_counters(counters, &read_attr, attrs);
if (ret)
goto err_read;
return ret;
ret = uverbs_copy_to(attrs, UVERBS_ATTR_READ_COUNTERS_BUFF,
return uverbs_copy_to(attrs, UVERBS_ATTR_READ_COUNTERS_BUFF,
read_attr.counters_buff,
read_attr.ncounters * sizeof(u64));
err_read:
kfree(read_attr.counters_buff);
return ret;
}
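The read handler above sizes its buffer with array_size(read_attr.ncounters, sizeof(u64)) rather than an open-coded multiplication. A minimal userspace sketch of that saturating-multiply idea follows. The names array_size_model() and zalloc_array_model() are hypothetical stand-ins, and treating SIZE_MAX as "overflowed" is an assumption of this sketch rather than something shown in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical model of a saturating size computation: return n * size,
 * or SIZE_MAX if the product would not fit in a size_t.
 */
static size_t array_size_model(size_t n, size_t size)
{
	if (n != 0 && size > SIZE_MAX / n)
		return SIZE_MAX;
	return n * size;
}

/*
 * Hypothetical zeroing allocator: a saturated size is refused outright,
 * so a wrapped multiplication can never yield a too-small buffer.
 * (A genuine product of exactly SIZE_MAX is refused as well.)
 */
static void *zalloc_array_model(size_t n, size_t size)
{
	size_t bytes = array_size_model(n, size);

	if (bytes == SIZE_MAX)
		return NULL;
	return calloc(1, bytes ? bytes : 1);
}
```

With this shape the caller only has to test the allocation result; the overflow case is folded into the same failure path.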
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_COUNTERS_CREATE,
&UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_COUNTERS_CREATE,
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
UVERBS_OBJECT_COUNTERS,
UVERBS_ACCESS_NEW,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
static DECLARE_UVERBS_NAMED_METHOD_WITH_HANDLER(UVERBS_METHOD_COUNTERS_DESTROY,
uverbs_destroy_def_handler,
&UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_COUNTERS_HANDLE,
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
UVERBS_METHOD_COUNTERS_DESTROY,
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_COUNTERS_HANDLE,
UVERBS_OBJECT_COUNTERS,
UVERBS_ACCESS_DESTROY,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
#define MAX_COUNTERS_BUFF_SIZE USHRT_MAX
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_COUNTERS_READ,
&UVERBS_ATTR_IDR(UVERBS_ATTR_READ_COUNTERS_HANDLE,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_COUNTERS_READ,
UVERBS_ATTR_IDR(UVERBS_ATTR_READ_COUNTERS_HANDLE,
UVERBS_OBJECT_COUNTERS,
UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_READ_COUNTERS_BUFF,
UVERBS_ATTR_SIZE(0, MAX_COUNTERS_BUFF_SIZE),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_READ_COUNTERS_FLAGS,
UVERBS_ATTR_TYPE(__u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY),
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_READ_COUNTERS_BUFF,
UVERBS_ATTR_MIN_SIZE(0),
UA_MANDATORY),
UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_READ_COUNTERS_FLAGS,
enum ib_uverbs_read_counters_flags));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_COUNTERS,
&UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_counters),
UVERBS_TYPE_ALLOC_IDR(uverbs_free_counters),
&UVERBS_METHOD(UVERBS_METHOD_COUNTERS_CREATE),
&UVERBS_METHOD(UVERBS_METHOD_COUNTERS_DESTROY),
&UVERBS_METHOD(UVERBS_METHOD_COUNTERS_READ));
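The change to uverbs_free_counters() in this file replaces an open-coded "explicit destroy while still in use returns -EBUSY" test with a call to ib_destroy_usecnt(). The body of that helper is not part of this hunk, so the following is only a simplified userspace model of the behaviour visible in the removed lines. All names ending in _model are hypothetical, and the real helper also takes the uobject, which this sketch ignores.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum rdma_remove_reason. */
enum remove_reason_model {
	REMOVE_DESTROY_MODEL,	/* user asked to destroy the object */
	REMOVE_CLOSE_MODEL,	/* file is closing, cleanup is forced */
};

struct counters_model {
	int usecnt;		/* number of users still attached */
	bool destroyed;
};

/* Refuse an explicit destroy while the object is still in use. */
static int destroy_usecnt_model(int usecnt, enum remove_reason_model why)
{
	if (usecnt != 0 && why == REMOVE_DESTROY_MODEL)
		return -EBUSY;
	return 0;
}

/* Mirrors the shape of the free callback: gate first, then destroy. */
static int free_counters_model(struct counters_model *c,
			       enum remove_reason_model why)
{
	int ret = destroy_usecnt_model(c->usecnt, why);

	if (ret)
		return ret;
	c->destroyed = true;
	return 0;
}
```

The same gate-then-destroy shape appears in the DM and flow action free callbacks touched by this series.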

View File

@ -44,21 +44,26 @@ static int uverbs_free_cq(struct ib_uobject *uobject,
int ret;
ret = ib_destroy_cq(cq);
if (!ret || why != RDMA_REMOVE_DESTROY)
ib_uverbs_release_ucq(uobject->context->ufile, ev_queue ?
container_of(ev_queue,
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
ib_uverbs_release_ucq(
uobject->context->ufile,
ev_queue ? container_of(ev_queue,
struct ib_uverbs_completion_event_file,
ev_queue) : NULL,
ev_queue) :
NULL,
ucq);
return ret;
}
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_ucontext *ucontext = file->ucontext;
struct ib_ucq_object *obj;
struct ib_ucq_object *obj = container_of(
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_CQ_HANDLE),
typeof(*obj), uobject);
struct ib_device *ib_dev = obj->uobject.context->device;
struct ib_udata uhw;
int ret;
u64 user_handle;
@ -67,7 +72,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
struct ib_uverbs_completion_event_file *ev_file = NULL;
struct ib_uobject *ev_file_uobj;
if (!(ib_dev->uverbs_cmd_mask & 1ULL << IB_USER_VERBS_CMD_CREATE_CQ))
if (!ib_dev->create_cq || !ib_dev->destroy_cq)
return -EOPNOTSUPP;
ret = uverbs_copy_from(&attr.comp_vector, attrs,
@ -81,28 +86,26 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
if (ret)
return ret;
/* Optional param, if it doesn't exist, we get -ENOENT and skip it */
if (IS_UVERBS_COPY_ERR(uverbs_copy_from(&attr.flags, attrs,
UVERBS_ATTR_CREATE_CQ_FLAGS)))
return -EFAULT;
ret = uverbs_get_flags32(&attr.flags, attrs,
UVERBS_ATTR_CREATE_CQ_FLAGS,
IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION |
IB_UVERBS_CQ_FLAGS_IGNORE_OVERRUN);
if (ret)
return ret;
ev_file_uobj = uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL);
if (!IS_ERR(ev_file_uobj)) {
ev_file = container_of(ev_file_uobj,
struct ib_uverbs_completion_event_file,
uobj_file.uobj);
uobj);
uverbs_uobject_get(ev_file_uobj);
}
if (attr.comp_vector >= ucontext->ufile->device->num_comp_vectors) {
if (attr.comp_vector >= file->device->num_comp_vectors) {
ret = -EINVAL;
goto err_event_file;
}
obj = container_of(uverbs_attr_get_uobject(attrs,
UVERBS_ATTR_CREATE_CQ_HANDLE),
typeof(*obj), uobject);
obj->uverbs_file = ucontext->ufile;
obj->comp_events_reported = 0;
obj->async_events_reported = 0;
INIT_LIST_HEAD(&obj->comp_list);
@ -111,7 +114,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
/* Temporary, only until drivers get the new uverbs_attr_bundle */
create_udata(attrs, &uhw);
cq = ib_dev->create_cq(ib_dev, &attr, ucontext, &uhw);
cq = ib_dev->create_cq(ib_dev, &attr, obj->uobject.context, &uhw);
if (IS_ERR(cq)) {
ret = PTR_ERR(cq);
goto err_event_file;
@ -143,69 +146,64 @@ err_event_file:
return ret;
};
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_CQ_CREATE,
&UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_CQ_HANDLE, UVERBS_OBJECT_CQ,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_CQ_CREATE,
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_CQ_HANDLE,
UVERBS_OBJECT_CQ,
UVERBS_ACCESS_NEW,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_CQE,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_CQE,
UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_USER_HANDLE,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_USER_HANDLE,
UVERBS_ATTR_TYPE(u64),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL,
UA_MANDATORY),
UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL,
UVERBS_OBJECT_COMP_CHANNEL,
UVERBS_ACCESS_READ),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_COMP_VECTOR, UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_FLAGS, UVERBS_ATTR_TYPE(u32)),
&UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_CQ_RESP_CQE, UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&uverbs_uhw_compat_in, &uverbs_uhw_compat_out);
UVERBS_ACCESS_READ,
UA_OPTIONAL),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_COMP_VECTOR,
UVERBS_ATTR_TYPE(u32),
UA_MANDATORY),
UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_CREATE_CQ_FLAGS,
enum ib_uverbs_ex_create_cq_flags),
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_CQ_RESP_CQE,
UVERBS_ATTR_TYPE(u32),
UA_MANDATORY),
UVERBS_ATTR_UHW());
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_DESTROY)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_DESTROY)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj =
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_DESTROY_CQ_HANDLE);
struct ib_uverbs_destroy_cq_resp resp;
struct ib_ucq_object *obj;
int ret;
if (IS_ERR(uobj))
return PTR_ERR(uobj);
obj = container_of(uobj, struct ib_ucq_object, uobject);
if (!(ib_dev->uverbs_cmd_mask & 1ULL << IB_USER_VERBS_CMD_DESTROY_CQ))
return -EOPNOTSUPP;
ret = rdma_explicit_destroy(uobj);
if (ret)
return ret;
resp.comp_events_reported = obj->comp_events_reported;
resp.async_events_reported = obj->async_events_reported;
struct ib_ucq_object *obj =
container_of(uobj, struct ib_ucq_object, uobject);
struct ib_uverbs_destroy_cq_resp resp = {
.comp_events_reported = obj->comp_events_reported,
.async_events_reported = obj->async_events_reported
};
return uverbs_copy_to(attrs, UVERBS_ATTR_DESTROY_CQ_RESP, &resp,
sizeof(resp));
}
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_CQ_DESTROY,
&UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_CQ_HANDLE, UVERBS_OBJECT_CQ,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_CQ_DESTROY,
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_CQ_HANDLE,
UVERBS_OBJECT_CQ,
UVERBS_ACCESS_DESTROY,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_DESTROY_CQ_RESP,
UA_MANDATORY),
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_DESTROY_CQ_RESP,
UVERBS_ATTR_TYPE(struct ib_uverbs_destroy_cq_resp),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_CQ,
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), uverbs_free_cq),
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_CQ,
&UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0,
uverbs_free_cq),
#if IS_ENABLED(CONFIG_INFINIBAND_EXP_LEGACY_VERBS_NEW_UAPI)
&UVERBS_METHOD(UVERBS_METHOD_CQ_CREATE),
&UVERBS_METHOD(UVERBS_METHOD_CQ_DESTROY)
#endif
);
);
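The CQ create handler now reads its flags through uverbs_get_flags32() with an explicit mask of permitted bits instead of copying the raw value. A rough userspace sketch of that validation step is below. The function and flag names are hypothetical, and two details are assumptions of the sketch rather than facts from this diff: that a missing optional attribute reads as zero, and that unknown bits are rejected with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the CQ creation flags. */
#define MODEL_FLAG_TIMESTAMP		(UINT32_C(1) << 0)
#define MODEL_FLAG_IGNORE_OVERRUN	(UINT32_C(1) << 1)
#define MODEL_ALLOWED	(MODEL_FLAG_TIMESTAMP | MODEL_FLAG_IGNORE_OVERRUN)

/*
 * user_flags == NULL models an optional attribute that was not supplied.
 * On error *to is left untouched.
 */
static int get_flags32_model(uint32_t *to, const uint32_t *user_flags,
			     uint32_t allowed_bits)
{
	uint32_t flags = user_flags ? *user_flags : 0;

	/* Any bit outside the allowed mask is unknown to this kernel. */
	if (flags & ~allowed_bits)
		return -EINVAL;	/* error code is an assumption */

	*to = flags;
	return 0;
}
```

Rejecting unknown bits up front is what lets new flag bits be added later without old kernels silently ignoring them.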


@ -37,20 +37,24 @@ static int uverbs_free_dm(struct ib_uobject *uobject,
enum rdma_remove_reason why)
{
struct ib_dm *dm = uobject->object;
int ret;
if (why == RDMA_REMOVE_DESTROY && atomic_read(&dm->usecnt))
return -EBUSY;
ret = ib_destroy_usecnt(&dm->usecnt, why, uobject);
if (ret)
return ret;
return dm->device->dealloc_dm(dm);
}
static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
static int
UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
{
struct ib_ucontext *ucontext = file->ucontext;
struct ib_dm_alloc_attr attr = {};
struct ib_uobject *uobj;
struct ib_uobject *uobj =
uverbs_attr_get(attrs, UVERBS_ATTR_ALLOC_DM_HANDLE)
->obj_attr.uobject;
struct ib_device *ib_dev = uobj->context->device;
struct ib_dm *dm;
int ret;
@ -67,9 +71,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(struct ib_device *ib_dev,
if (ret)
return ret;
uobj = uverbs_attr_get(attrs, UVERBS_ATTR_ALLOC_DM_HANDLE)->obj_attr.uobject;
dm = ib_dev->alloc_dm(ib_dev, ucontext, &attr, attrs);
dm = ib_dev->alloc_dm(ib_dev, uobj->context, &attr, attrs);
if (IS_ERR(dm))
return PTR_ERR(dm);
@ -83,26 +85,27 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(struct ib_device *ib_dev,
return 0;
}
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_DM_ALLOC,
&UVERBS_ATTR_IDR(UVERBS_ATTR_ALLOC_DM_HANDLE, UVERBS_OBJECT_DM,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_DM_ALLOC,
UVERBS_ATTR_IDR(UVERBS_ATTR_ALLOC_DM_HANDLE,
UVERBS_OBJECT_DM,
UVERBS_ACCESS_NEW,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_ALLOC_DM_LENGTH,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_ALLOC_DM_LENGTH,
UVERBS_ATTR_TYPE(u64),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_ALLOC_DM_ALIGNMENT,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_ALLOC_DM_ALIGNMENT,
UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
static DECLARE_UVERBS_NAMED_METHOD_WITH_HANDLER(UVERBS_METHOD_DM_FREE,
uverbs_destroy_def_handler,
&UVERBS_ATTR_IDR(UVERBS_ATTR_FREE_DM_HANDLE,
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
UVERBS_METHOD_DM_FREE,
UVERBS_ATTR_IDR(UVERBS_ATTR_FREE_DM_HANDLE,
UVERBS_OBJECT_DM,
UVERBS_ACCESS_DESTROY,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_DM,
/* 1 is used in order to free the DM after MRs */
&UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_dm),
UVERBS_TYPE_ALLOC_IDR(uverbs_free_dm),
&UVERBS_METHOD(UVERBS_METHOD_DM_ALLOC),
&UVERBS_METHOD(UVERBS_METHOD_DM_FREE));


@ -37,10 +37,11 @@ static int uverbs_free_flow_action(struct ib_uobject *uobject,
enum rdma_remove_reason why)
{
struct ib_flow_action *action = uobject->object;
int ret;
if (why == RDMA_REMOVE_DESTROY &&
atomic_read(&action->usecnt))
return -EBUSY;
ret = ib_destroy_usecnt(&action->usecnt, why, uobject);
if (ret)
return ret;
return action->device->destroy_flow_action(action);
}
@ -303,12 +304,13 @@ static int parse_flow_action_esp(struct ib_device *ib_dev,
return 0;
}
static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj = uverbs_attr_get_uobject(
attrs, UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE);
struct ib_device *ib_dev = uobj->context->device;
int ret;
struct ib_uobject *uobj;
struct ib_flow_action *action;
struct ib_flow_action_esp_attr esp_attr = {};
@ -320,7 +322,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(struct ib_device
return ret;
/* No need to check as this attribute is marked as MANDATORY */
uobj = uverbs_attr_get_uobject(attrs, UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE);
action = ib_dev->create_flow_action_esp(ib_dev, &esp_attr.hdr, attrs);
if (IS_ERR(action))
return PTR_ERR(action);
@ -334,102 +335,109 @@ static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(struct ib_device
return 0;
}
static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj = uverbs_attr_get_uobject(
attrs, UVERBS_ATTR_MODIFY_FLOW_ACTION_ESP_HANDLE);
struct ib_flow_action *action = uobj->object;
int ret;
struct ib_uobject *uobj;
struct ib_flow_action *action;
struct ib_flow_action_esp_attr esp_attr = {};
if (!ib_dev->modify_flow_action_esp)
if (!action->device->modify_flow_action_esp)
return -EOPNOTSUPP;
ret = parse_flow_action_esp(ib_dev, file, attrs, &esp_attr, true);
ret = parse_flow_action_esp(action->device, file, attrs, &esp_attr,
true);
if (ret)
return ret;
uobj = uverbs_attr_get_uobject(attrs, UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE);
action = uobj->object;
if (action->type != IB_FLOW_ACTION_ESP)
return -EINVAL;
return ib_dev->modify_flow_action_esp(action,
&esp_attr.hdr,
return action->device->modify_flow_action_esp(action, &esp_attr.hdr,
attrs);
}
static const struct uverbs_attr_spec uverbs_flow_action_esp_keymat[] = {
[IB_UVERBS_FLOW_ACTION_ESP_KEYMAT_AES_GCM] = {
{ .ptr = {
.type = UVERBS_ATTR_TYPE_PTR_IN,
UVERBS_ATTR_TYPE(struct ib_uverbs_flow_action_esp_keymat_aes_gcm),
.flags = UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO,
} },
UVERBS_ATTR_STRUCT(
struct ib_uverbs_flow_action_esp_keymat_aes_gcm,
aes_key),
},
};
static const struct uverbs_attr_spec uverbs_flow_action_esp_replay[] = {
[IB_UVERBS_FLOW_ACTION_ESP_REPLAY_NONE] = {
{ .ptr = {
.type = UVERBS_ATTR_TYPE_PTR_IN,
/* No need to specify any data */
.len = 0,
} }
UVERBS_ATTR_NO_DATA(),
},
[IB_UVERBS_FLOW_ACTION_ESP_REPLAY_BMP] = {
{ .ptr = {
.type = UVERBS_ATTR_TYPE_PTR_IN,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp_replay_bmp, size),
.flags = UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO,
} }
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp_replay_bmp,
size),
},
};
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE,
&UVERBS_ATTR_IDR(UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE, UVERBS_OBJECT_FLOW_ACTION,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_FLOW_ACTION_ESP_CREATE,
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE,
UVERBS_OBJECT_FLOW_ACTION,
UVERBS_ACCESS_NEW,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp, hard_limit_pkts),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY |
UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ESN, UVERBS_ATTR_TYPE(__u32)),
&UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp,
hard_limit_pkts),
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ESN,
UVERBS_ATTR_TYPE(__u32),
UA_OPTIONAL),
UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT,
uverbs_flow_action_esp_keymat,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY,
uverbs_flow_action_esp_replay),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp_encap, type)));
UA_MANDATORY),
UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY,
uverbs_flow_action_esp_replay,
UA_OPTIONAL),
UVERBS_ATTR_PTR_IN(
UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP,
UVERBS_ATTR_TYPE(struct ib_uverbs_flow_action_esp_encap),
UA_OPTIONAL));
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY,
&UVERBS_ATTR_IDR(UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE, UVERBS_OBJECT_FLOW_ACTION,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY,
UVERBS_ATTR_IDR(UVERBS_ATTR_MODIFY_FLOW_ACTION_ESP_HANDLE,
UVERBS_OBJECT_FLOW_ACTION,
UVERBS_ACCESS_WRITE,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp, hard_limit_pkts),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ESN, UVERBS_ATTR_TYPE(__u32)),
&UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT,
uverbs_flow_action_esp_keymat),
&UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY,
uverbs_flow_action_esp_replay),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp_encap, type)));
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS,
UVERBS_ATTR_STRUCT(struct ib_uverbs_flow_action_esp,
hard_limit_pkts),
UA_OPTIONAL),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_FLOW_ACTION_ESP_ESN,
UVERBS_ATTR_TYPE(__u32),
UA_OPTIONAL),
UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT,
uverbs_flow_action_esp_keymat,
UA_OPTIONAL),
UVERBS_ATTR_ENUM_IN(UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY,
uverbs_flow_action_esp_replay,
UA_OPTIONAL),
UVERBS_ATTR_PTR_IN(
UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP,
UVERBS_ATTR_TYPE(struct ib_uverbs_flow_action_esp_encap),
UA_OPTIONAL));
static DECLARE_UVERBS_NAMED_METHOD_WITH_HANDLER(UVERBS_METHOD_FLOW_ACTION_DESTROY,
uverbs_destroy_def_handler,
&UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_FLOW_ACTION_HANDLE,
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
UVERBS_METHOD_FLOW_ACTION_DESTROY,
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_FLOW_ACTION_HANDLE,
UVERBS_OBJECT_FLOW_ACTION,
UVERBS_ACCESS_DESTROY,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_FLOW_ACTION,
&UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_flow_action),
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_FLOW_ACTION,
UVERBS_TYPE_ALLOC_IDR(uverbs_free_flow_action),
&UVERBS_METHOD(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE),
&UVERBS_METHOD(UVERBS_METHOD_FLOW_ACTION_DESTROY),
&UVERBS_METHOD(UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY));
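Several attribute specs in this file use UVERBS_ATTR_STRUCT(struct ..., last_field), and the removed lines pair it with UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO. The macro definition is not shown here, so the sketch below rests on an assumed reading: the named field marks the minimum size userspace must supply, and bytes past the kernel's known struct must be zero. Everything in it (struct esp_model, MIN_LEN_THROUGH, the error codes) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extensible struct; later_extension was added "later". */
struct esp_model {
	uint32_t spi;
	uint32_t seq;
	uint64_t hard_limit_pkts;
	uint64_t later_extension;
};

/* Number of bytes from the start of the struct through 'member'. */
#define MIN_LEN_THROUGH(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

static int copy_struct_model(struct esp_model *dst, const void *src,
			     size_t len)
{
	const unsigned char *p = src;
	size_t i;

	/* Too short to contain the mandatory prefix. */
	if (len < MIN_LEN_THROUGH(struct esp_model, hard_limit_pkts))
		return -EINVAL;

	/* Longer than we understand: the unknown tail must be zero. */
	if (len > sizeof(*dst)) {
		for (i = sizeof(*dst); i < len; i++)
			if (p[i])
				return -EOPNOTSUPP;	/* assumption */
		len = sizeof(*dst);
	}

	/* Fields the caller did not supply read as zero. */
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
	return 0;
}
```

Under this reading an old userspace can pass the short struct to a new kernel, and a new userspace can pass the long struct to an old kernel as long as the extra fields are unused.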


@ -39,14 +39,18 @@ static int uverbs_free_mr(struct ib_uobject *uobject,
return ib_dereg_mr((struct ib_mr *)uobject->object);
}
static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(struct ib_device *ib_dev,
struct ib_uverbs_file *file,
struct uverbs_attr_bundle *attrs)
static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(
struct ib_uverbs_file *file, struct uverbs_attr_bundle *attrs)
{
struct ib_dm_mr_attr attr = {};
struct ib_uobject *uobj;
struct ib_dm *dm;
struct ib_pd *pd;
struct ib_uobject *uobj =
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_REG_DM_MR_HANDLE);
struct ib_dm *dm =
uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DM_MR_DM_HANDLE);
struct ib_pd *pd =
uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DM_MR_PD_HANDLE);
struct ib_device *ib_dev = pd->device;
struct ib_mr *mr;
int ret;
@ -62,8 +66,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(struct ib_device *ib_dev,
if (ret)
return ret;
ret = uverbs_copy_from(&attr.access_flags, attrs,
UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS);
ret = uverbs_get_flags32(&attr.access_flags, attrs,
UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS,
IB_ACCESS_SUPPORTED);
if (ret)
return ret;
@ -74,12 +79,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(struct ib_device *ib_dev,
if (ret)
return ret;
pd = uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DM_MR_PD_HANDLE);
dm = uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DM_MR_DM_HANDLE);
uobj = uverbs_attr_get(attrs, UVERBS_ATTR_REG_DM_MR_HANDLE)->obj_attr.uobject;
if (attr.offset > dm->length || attr.length > dm->length ||
attr.length > dm->length - attr.offset)
return -EINVAL;
@ -115,33 +114,36 @@ err_dereg:
return ret;
}
static DECLARE_UVERBS_NAMED_METHOD(UVERBS_METHOD_DM_MR_REG,
&UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_HANDLE, UVERBS_OBJECT_MR,
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_DM_MR_REG,
UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_HANDLE,
UVERBS_OBJECT_MR,
UVERBS_ACCESS_NEW,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DM_MR_OFFSET,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DM_MR_OFFSET,
UVERBS_ATTR_TYPE(u64),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DM_MR_LENGTH,
UA_MANDATORY),
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DM_MR_LENGTH,
UVERBS_ATTR_TYPE(u64),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_PD_HANDLE, UVERBS_OBJECT_PD,
UA_MANDATORY),
UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_PD_HANDLE,
UVERBS_OBJECT_PD,
UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS,
UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_DM_HANDLE, UVERBS_OBJECT_DM,
UA_MANDATORY),
UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS,
enum ib_access_flags),
UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DM_MR_DM_HANDLE,
UVERBS_OBJECT_DM,
UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DM_MR_RESP_LKEY,
UA_MANDATORY),
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DM_MR_RESP_LKEY,
UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
&UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DM_MR_RESP_RKEY,
UA_MANDATORY),
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DM_MR_RESP_RKEY,
UVERBS_ATTR_TYPE(u32),
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
UA_MANDATORY));
DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_MR,
/* 1 is used in order to free the MR after all the MWs */
&UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mr),
DECLARE_UVERBS_NAMED_OBJECT(
UVERBS_OBJECT_MR,
UVERBS_TYPE_ALLOC_IDR(uverbs_free_mr),
&UVERBS_METHOD(UVERBS_METHOD_DM_MR_REG));
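The DM MR registration handler in this file checks the requested range with three comparisons instead of testing offset + length directly. A small standalone sketch of why that form is used: with unsigned 64-bit values the sum can wrap around, while the subtraction form cannot once the first two tests have passed. The function name dm_range_ok_model() is hypothetical; the comparison itself is copied from the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when [offset, offset + length) lies inside a region of
 * dm_length bytes, without ever computing offset + length.
 */
static bool dm_range_ok_model(uint64_t dm_length, uint64_t offset,
			      uint64_t length)
{
	if (offset > dm_length || length > dm_length ||
	    length > dm_length - offset)
		return false;
	return true;
}
```

For example, offset = UINT64_MAX and length = 2 would wrap to 1 in a naive offset + length <= dm_length test and be accepted; the form above rejects it at the first comparison.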


@ -0,0 +1,346 @@
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
* Copyright (c) 2017, Mellanox Technologies inc. All rights reserved.
*/
#include <rdma/uverbs_ioctl.h>
#include <rdma/rdma_user_ioctl.h>
#include <linux/bitops.h>
#include "rdma_core.h"
#include "uverbs.h"
static void *uapi_add_elm(struct uverbs_api *uapi, u32 key, size_t alloc_size)
{
void *elm;
int rc;
if (key == UVERBS_API_KEY_ERR)
return ERR_PTR(-EOVERFLOW);
elm = kzalloc(alloc_size, GFP_KERNEL);
if (!elm)
return ERR_PTR(-ENOMEM);
rc = radix_tree_insert(&uapi->radix, key, elm);
if (rc) {
kfree(elm);
return ERR_PTR(rc);
}
return elm;
}
static int uapi_merge_method(struct uverbs_api *uapi,
struct uverbs_api_object *obj_elm, u32 obj_key,
const struct uverbs_method_def *method,
bool is_driver)
{
u32 method_key = obj_key | uapi_key_ioctl_method(method->id);
struct uverbs_api_ioctl_method *method_elm;
unsigned int i;
if (!method->attrs)
return 0;
method_elm = uapi_add_elm(uapi, method_key, sizeof(*method_elm));
if (IS_ERR(method_elm)) {
if (method_elm != ERR_PTR(-EEXIST))
return PTR_ERR(method_elm);
/*
* This occurs when a driver uses ADD_UVERBS_ATTRIBUTES_SIMPLE
*/
if (WARN_ON(method->handler))
return -EINVAL;
method_elm = radix_tree_lookup(&uapi->radix, method_key);
if (WARN_ON(!method_elm))
return -EINVAL;
} else {
WARN_ON(!method->handler);
rcu_assign_pointer(method_elm->handler, method->handler);
if (method->handler != uverbs_destroy_def_handler)
method_elm->driver_method = is_driver;
}
for (i = 0; i != method->num_attrs; i++) {
const struct uverbs_attr_def *attr = (*method->attrs)[i];
struct uverbs_api_attr *attr_slot;
if (!attr)
continue;
/*
* ENUM_IN contains the 'ids' pointer to the driver's .rodata,
* so if it is specified by a driver then it always makes this
* into a driver method.
*/
if (attr->attr.type == UVERBS_ATTR_TYPE_ENUM_IN)
method_elm->driver_method |= is_driver;
attr_slot =
uapi_add_elm(uapi, method_key | uapi_key_attr(attr->id),
sizeof(*attr_slot));
/* Attributes are not allowed to be modified by drivers */
if (IS_ERR(attr_slot))
return PTR_ERR(attr_slot);
attr_slot->spec = attr->attr;
}
return 0;
}
static int uapi_merge_tree(struct uverbs_api *uapi,
const struct uverbs_object_tree_def *tree,
bool is_driver)
{
unsigned int i, j;
int rc;
if (!tree->objects)
return 0;
for (i = 0; i != tree->num_objects; i++) {
const struct uverbs_object_def *obj = (*tree->objects)[i];
struct uverbs_api_object *obj_elm;
u32 obj_key;
if (!obj)
continue;
obj_key = uapi_key_obj(obj->id);
obj_elm = uapi_add_elm(uapi, obj_key, sizeof(*obj_elm));
if (IS_ERR(obj_elm)) {
if (obj_elm != ERR_PTR(-EEXIST))
return PTR_ERR(obj_elm);
/* This occurs when a driver uses ADD_UVERBS_METHODS */
if (WARN_ON(obj->type_attrs))
return -EINVAL;
obj_elm = radix_tree_lookup(&uapi->radix, obj_key);
if (WARN_ON(!obj_elm))
return -EINVAL;
} else {
obj_elm->type_attrs = obj->type_attrs;
if (obj->type_attrs) {
obj_elm->type_class =
obj->type_attrs->type_class;
/*
* Today drivers are only permitted to use
* idr_class types. They cannot use FD types
* because we currently have no way to revoke
* the fops pointer after device
* disassociation.
*/
if (WARN_ON(is_driver &&
obj->type_attrs->type_class !=
&uverbs_idr_class))
return -EINVAL;
}
}
if (!obj->methods)
continue;
for (j = 0; j != obj->num_methods; j++) {
const struct uverbs_method_def *method =
(*obj->methods)[j];
if (!method)
continue;
rc = uapi_merge_method(uapi, obj_elm, obj_key, method,
is_driver);
if (rc)
return rc;
}
}
return 0;
}
static int
uapi_finalize_ioctl_method(struct uverbs_api *uapi,
struct uverbs_api_ioctl_method *method_elm,
u32 method_key)
{
struct radix_tree_iter iter;
unsigned int num_attrs = 0;
unsigned int max_bkey = 0;
bool single_uobj = false;
void __rcu **slot;
method_elm->destroy_bkey = UVERBS_API_ATTR_BKEY_LEN;
radix_tree_for_each_slot (slot, &uapi->radix, &iter,
uapi_key_attrs_start(method_key)) {
struct uverbs_api_attr *elm =
rcu_dereference_protected(*slot, true);
u32 attr_key = iter.index & UVERBS_API_ATTR_KEY_MASK;
u32 attr_bkey = uapi_bkey_attr(attr_key);
u8 type = elm->spec.type;
if (uapi_key_attr_to_method(iter.index) !=
uapi_key_attr_to_method(method_key))
break;
if (elm->spec.mandatory)
__set_bit(attr_bkey, method_elm->attr_mandatory);
if (type == UVERBS_ATTR_TYPE_IDR ||
type == UVERBS_ATTR_TYPE_FD) {
u8 access = elm->spec.u.obj.access;
/*
* Verbs specs may only have one NEW/DESTROY, we don't
* have the infrastructure to abort multiple NEW's or
* cope with multiple DESTROY failure.
*/
if (access == UVERBS_ACCESS_NEW ||
access == UVERBS_ACCESS_DESTROY) {
if (WARN_ON(single_uobj))
return -EINVAL;
single_uobj = true;
if (WARN_ON(!elm->spec.mandatory))
return -EINVAL;
}
if (access == UVERBS_ACCESS_DESTROY)
method_elm->destroy_bkey = attr_bkey;
}
max_bkey = max(max_bkey, attr_bkey);
num_attrs++;
}
method_elm->key_bitmap_len = max_bkey + 1;
WARN_ON(method_elm->key_bitmap_len > UVERBS_API_ATTR_BKEY_LEN);
uapi_compute_bundle_size(method_elm, num_attrs);
return 0;
}
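uapi_finalize_ioctl_method() above walks every attribute of a method, records which ones are mandatory in a bitmap, and enforces that at most one object attribute uses NEW or DESTROY access and that it is mandatory. The following userspace sketch models that pass over a plain array instead of a radix tree. All type and function names are hypothetical, the 32-entry bitmap width is an assumption, and the up-front bound check on bkey is an addition of the sketch (the kernel code uses a WARN_ON after the loop).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum access_model { ACCESS_READ, ACCESS_WRITE, ACCESS_NEW, ACCESS_DESTROY };

struct attr_model {
	unsigned int bkey;	/* position in the per-method bitmap */
	bool is_object;		/* IDR or FD attribute */
	bool mandatory;
	enum access_model access;
};

#define BKEY_LEN_MODEL 32u	/* hypothetical bitmap width */

struct method_model {
	uint32_t attr_mandatory;	/* one bit per mandatory attribute */
	unsigned int destroy_bkey;	/* BKEY_LEN_MODEL means "none" */
	unsigned int key_bitmap_len;
};

static int finalize_method_model(struct method_model *m,
				 const struct attr_model *attrs, size_t n)
{
	bool single_uobj = false;
	unsigned int max_bkey = 0;
	size_t i;

	m->attr_mandatory = 0;
	m->destroy_bkey = BKEY_LEN_MODEL;

	for (i = 0; i < n; i++) {
		const struct attr_model *a = &attrs[i];

		if (a->bkey >= BKEY_LEN_MODEL)
			return -EINVAL;
		if (a->mandatory)
			m->attr_mandatory |= UINT32_C(1) << a->bkey;

		if (a->is_object && (a->access == ACCESS_NEW ||
				     a->access == ACCESS_DESTROY)) {
			/* Only one NEW/DESTROY per method is supported. */
			if (single_uobj)
				return -EINVAL;
			single_uobj = true;
			/* ...and it may not be optional. */
			if (!a->mandatory)
				return -EINVAL;
			if (a->access == ACCESS_DESTROY)
				m->destroy_bkey = a->bkey;
		}
		if (a->bkey > max_bkey)
			max_bkey = a->bkey;
	}
	m->key_bitmap_len = max_bkey + 1;
	return 0;
}
```

Doing this once at setup time means the per-ioctl path can check mandatory attributes with a bitmap comparison instead of re-walking the specification.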
static int uapi_finalize(struct uverbs_api *uapi)
{
struct radix_tree_iter iter;
void __rcu **slot;
int rc;
radix_tree_for_each_slot (slot, &uapi->radix, &iter, 0) {
struct uverbs_api_ioctl_method *method_elm =
rcu_dereference_protected(*slot, true);
if (uapi_key_is_ioctl_method(iter.index)) {
rc = uapi_finalize_ioctl_method(uapi, method_elm,
iter.index);
if (rc)
return rc;
}
}
return 0;
}
void uverbs_destroy_api(struct uverbs_api *uapi)
{
struct radix_tree_iter iter;
void __rcu **slot;
if (!uapi)
return;
radix_tree_for_each_slot (slot, &uapi->radix, &iter, 0) {
kfree(rcu_dereference_protected(*slot, true));
radix_tree_iter_delete(&uapi->radix, &iter, slot);
}
}
struct uverbs_api *uverbs_alloc_api(
const struct uverbs_object_tree_def *const *driver_specs,
enum rdma_driver_id driver_id)
{
struct uverbs_api *uapi;
int rc;
uapi = kzalloc(sizeof(*uapi), GFP_KERNEL);
if (!uapi)
return ERR_PTR(-ENOMEM);
INIT_RADIX_TREE(&uapi->radix, GFP_KERNEL);
uapi->driver_id = driver_id;
rc = uapi_merge_tree(uapi, uverbs_default_get_objects(), false);
if (rc)
goto err;
for (; driver_specs && *driver_specs; driver_specs++) {
rc = uapi_merge_tree(uapi, *driver_specs, true);
if (rc)
goto err;
}
rc = uapi_finalize(uapi);
if (rc)
goto err;
return uapi;
err:
if (rc != -ENOMEM)
pr_err("Setup of uverbs_api failed, kernel parsing tree description is not valid (%d)??\n",
rc);
uverbs_destroy_api(uapi);
return ERR_PTR(rc);
}
/*
* The pre version is done before destroying the HW objects, it only blocks
* off method access. All methods that require the ib_dev or the module data
* must test one of these assignments prior to continuing.
*/
void uverbs_disassociate_api_pre(struct ib_uverbs_device *uverbs_dev)
{
struct uverbs_api *uapi = uverbs_dev->uapi;
struct radix_tree_iter iter;
void __rcu **slot;
rcu_assign_pointer(uverbs_dev->ib_dev, NULL);
radix_tree_for_each_slot (slot, &uapi->radix, &iter, 0) {
if (uapi_key_is_ioctl_method(iter.index)) {
struct uverbs_api_ioctl_method *method_elm =
rcu_dereference_protected(*slot, true);
if (method_elm->driver_method)
rcu_assign_pointer(method_elm->handler, NULL);
}
}
synchronize_srcu(&uverbs_dev->disassociate_srcu);
}
/*
* Called when a driver disassociates from the ib_uverbs_device. The
* assumption is that the driver module will unload after. Replace everything
* related to the driver with NULL as a safety measure.
*/
void uverbs_disassociate_api(struct uverbs_api *uapi)
{
struct radix_tree_iter iter;
void __rcu **slot;
radix_tree_for_each_slot (slot, &uapi->radix, &iter, 0) {
if (uapi_key_is_object(iter.index)) {
struct uverbs_api_object *object_elm =
rcu_dereference_protected(*slot, true);
/*
* Some type_attrs are in the driver module. We don't
* bother to keep track of which since there should be
* no use of this after disassociate.
*/
object_elm->type_attrs = NULL;
} else if (uapi_key_is_attr(iter.index)) {
struct uverbs_api_attr *elm =
rcu_dereference_protected(*slot, true);
if (elm->spec.type == UVERBS_ATTR_TYPE_ENUM_IN)
elm->spec.u2.enum_def.ids = NULL;
}
}
}


@ -326,12 +326,162 @@ EXPORT_SYMBOL(ib_dealloc_pd);
/* Address handles */
/**
* rdma_copy_ah_attr - Copy rdma ah attribute from source to destination.
* @dest: Pointer to destination ah_attr. The contents of the destination
* pointer are assumed to be invalid and its attributes are overwritten.
* @src: Pointer to source ah_attr.
*/
void rdma_copy_ah_attr(struct rdma_ah_attr *dest,
const struct rdma_ah_attr *src)
{
*dest = *src;
if (dest->grh.sgid_attr)
rdma_hold_gid_attr(dest->grh.sgid_attr);
}
EXPORT_SYMBOL(rdma_copy_ah_attr);
/**
* rdma_replace_ah_attr - Replace valid ah_attr with a new one.
* @old: Pointer to existing ah_attr which needs to be replaced.
* old is assumed to be valid or zero'd
* @new: Pointer to the new ah_attr.
*
* rdma_replace_ah_attr() first releases any reference in the old ah_attr if
* the old ah_attr is valid; after that it copies the new attribute and holds
* the reference to the replaced ah_attr.
*/
void rdma_replace_ah_attr(struct rdma_ah_attr *old,
const struct rdma_ah_attr *new)
{
rdma_destroy_ah_attr(old);
*old = *new;
if (old->grh.sgid_attr)
rdma_hold_gid_attr(old->grh.sgid_attr);
}
EXPORT_SYMBOL(rdma_replace_ah_attr);
/**
* rdma_move_ah_attr - Move ah_attr pointed by source to destination.
* @dest: Pointer to destination ah_attr to copy to.
* dest is assumed to be valid or zero'd
* @src: Pointer to the new ah_attr.
*
* rdma_move_ah_attr() first releases any reference in the destination ah_attr
* if it is valid. This also transfers ownership of internal references from
* src to dest, making src invalid in the process. No new reference of the src
* ah_attr is taken.
*/
void rdma_move_ah_attr(struct rdma_ah_attr *dest, struct rdma_ah_attr *src)
{
rdma_destroy_ah_attr(dest);
*dest = *src;
src->grh.sgid_attr = NULL;
}
EXPORT_SYMBOL(rdma_move_ah_attr);
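The copy, replace, and move helpers above differ only in who ends up holding the sgid_attr reference. The following is a minimal userspace sketch of that bookkeeping, not the kernel implementation: `struct mock_gid_attr`, `struct mock_ah_attr`, and every `mock_*` function are hypothetical stand-ins, and a plain integer stands in for the kref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ib_gid_attr: only a reference count. */
struct mock_gid_attr {
	int refs;
};

/* Hypothetical stand-in for struct rdma_ah_attr: only the sgid_attr pointer. */
struct mock_ah_attr {
	struct mock_gid_attr *sgid_attr;
};

void mock_hold_gid_attr(struct mock_gid_attr *attr)
{
	attr->refs++;
}

void mock_put_gid_attr(struct mock_gid_attr *attr)
{
	assert(attr->refs > 0);
	attr->refs--;
}

/* Drop the reference, if any. Safe to repeat and safe on a zeroed struct. */
void mock_destroy_ah_attr(struct mock_ah_attr *attr)
{
	if (attr->sgid_attr) {
		mock_put_gid_attr(attr->sgid_attr);
		attr->sgid_attr = NULL;
	}
}

/* Copy: dest is treated as uninitialized; both sides hold a reference after. */
void mock_copy_ah_attr(struct mock_ah_attr *dest, const struct mock_ah_attr *src)
{
	*dest = *src;
	if (dest->sgid_attr)
		mock_hold_gid_attr(dest->sgid_attr);
}

/* Replace: old must be valid or zeroed; its reference is dropped first. */
void mock_replace_ah_attr(struct mock_ah_attr *old, const struct mock_ah_attr *repl)
{
	mock_destroy_ah_attr(old);
	*old = *repl;
	if (old->sgid_attr)
		mock_hold_gid_attr(old->sgid_attr);
}

/* Move: ownership transfers to dest, no new reference is taken, src is cleared. */
void mock_move_ah_attr(struct mock_ah_attr *dest, struct mock_ah_attr *src)
{
	mock_destroy_ah_attr(dest);
	*dest = *src;
	src->sgid_attr = NULL;
}
```

In this sketch a copy or replace raises the count by one, while a move leaves it unchanged and makes the source safe to destroy again.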
/*
* Validate that the rdma_ah_attr is valid for the device before passing it
* off to the driver.
*/
static int rdma_check_ah_attr(struct ib_device *device,
struct rdma_ah_attr *ah_attr)
{
if (!rdma_is_port_valid(device, ah_attr->port_num))
return -EINVAL;
if ((rdma_is_grh_required(device, ah_attr->port_num) ||
ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) &&
!(ah_attr->ah_flags & IB_AH_GRH))
return -EINVAL;
if (ah_attr->grh.sgid_attr) {
/*
* Make sure the passed sgid_attr is consistent with the
* parameters
*/
if (ah_attr->grh.sgid_attr->index != ah_attr->grh.sgid_index ||
ah_attr->grh.sgid_attr->port_num != ah_attr->port_num)
return -EINVAL;
}
return 0;
}
/*
* If the ah requires a GRH then ensure that sgid_attr pointer is filled in.
* On success the caller is responsible to call rdma_unfill_sgid_attr().
*/
static int rdma_fill_sgid_attr(struct ib_device *device,
struct rdma_ah_attr *ah_attr,
const struct ib_gid_attr **old_sgid_attr)
{
const struct ib_gid_attr *sgid_attr;
struct ib_global_route *grh;
int ret;
*old_sgid_attr = ah_attr->grh.sgid_attr;
ret = rdma_check_ah_attr(device, ah_attr);
if (ret)
return ret;
if (!(ah_attr->ah_flags & IB_AH_GRH))
return 0;
grh = rdma_ah_retrieve_grh(ah_attr);
if (grh->sgid_attr)
return 0;
sgid_attr =
rdma_get_gid_attr(device, ah_attr->port_num, grh->sgid_index);
if (IS_ERR(sgid_attr))
return PTR_ERR(sgid_attr);
/* Move ownership of the kref into the ah_attr */
grh->sgid_attr = sgid_attr;
return 0;
}
static void rdma_unfill_sgid_attr(struct rdma_ah_attr *ah_attr,
const struct ib_gid_attr *old_sgid_attr)
{
/*
* Fill didn't change anything, the caller retains ownership of
* whatever it passed
*/
if (ah_attr->grh.sgid_attr == old_sgid_attr)
return;
/*
* Otherwise, we need to undo what rdma_fill_sgid_attr() did so the caller
* doesn't see any change in the rdma_ah_attr. If we get here
* old_sgid_attr is NULL.
*/
rdma_destroy_ah_attr(ah_attr);
}
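The fill/unfill pair above temporarily populates sgid_attr for the duration of a driver call and then restores what the caller passed in. Below is a rough userspace sketch of that pattern. All `demo_*` names are hypothetical, the GID table lookup is reduced to a caller-supplied pointer, and an integer replaces the kref.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fields needed for the pattern. */
struct demo_gid_attr {
	int refs;
};

struct demo_ah_attr {
	int needs_grh;                    /* models ah_flags & IB_AH_GRH */
	struct demo_gid_attr *sgid_attr;  /* NULL unless the caller set one */
};

/*
 * Remember what the caller passed in *old, then fill sgid_attr from
 * table_entry (a stand-in for the GID table lookup) only when a GRH is
 * needed and the caller did not supply an attribute.
 */
int demo_fill_sgid_attr(struct demo_ah_attr *attr,
			struct demo_gid_attr *table_entry,
			struct demo_gid_attr **old)
{
	*old = attr->sgid_attr;
	if (!attr->needs_grh || attr->sgid_attr)
		return 0;
	if (!table_entry)
		return -ENOENT;
	table_entry->refs++;
	attr->sgid_attr = table_entry;
	return 0;
}

/* Undo the fill only if it changed something, so the caller sees no change. */
void demo_unfill_sgid_attr(struct demo_ah_attr *attr, struct demo_gid_attr *old)
{
	if (attr->sgid_attr == old)
		return;
	assert(old == NULL);
	attr->sgid_attr->refs--;
	attr->sgid_attr = NULL;
}
```

The point of recording the old pointer is that unfill can tell a reference it took itself from one the caller already owned, and release only the former.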
static const struct ib_gid_attr *
rdma_update_sgid_attr(struct rdma_ah_attr *ah_attr,
const struct ib_gid_attr *old_attr)
{
if (old_attr)
rdma_put_gid_attr(old_attr);
if (ah_attr->ah_flags & IB_AH_GRH) {
rdma_hold_gid_attr(ah_attr->grh.sgid_attr);
return ah_attr->grh.sgid_attr;
}
return NULL;
}
static struct ib_ah *_rdma_create_ah(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
struct ib_udata *udata)
{
struct ib_ah *ah;
if (!pd->device->create_ah)
return ERR_PTR(-EOPNOTSUPP);
ah = pd->device->create_ah(pd, ah_attr, udata);
if (!IS_ERR(ah)) {
@ -339,15 +489,38 @@ static struct ib_ah *_rdma_create_ah(struct ib_pd *pd,
ah->pd = pd;
ah->uobject = NULL;
ah->type = ah_attr->type;
ah->sgid_attr = rdma_update_sgid_attr(ah_attr, NULL);
atomic_inc(&pd->usecnt);
}
return ah;
}
/**
* rdma_create_ah - Creates an address handle for the
* given address vector.
* @pd: The protection domain associated with the address handle.
* @ah_attr: The attributes of the address vector.
*
* It returns the address handle on success, or an ERR_PTR()-encoded error
* code on failure.
* The address handle is used to reference a local or global destination
* in all UD QP post sends.
*/
struct ib_ah *rdma_create_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr)
{
return _rdma_create_ah(pd, ah_attr, NULL);
const struct ib_gid_attr *old_sgid_attr;
struct ib_ah *ah;
int ret;
ret = rdma_fill_sgid_attr(pd->device, ah_attr, &old_sgid_attr);
if (ret)
return ERR_PTR(ret);
ah = _rdma_create_ah(pd, ah_attr, NULL);
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
return ah;
}
EXPORT_SYMBOL(rdma_create_ah);
@ -368,15 +541,27 @@ struct ib_ah *rdma_create_user_ah(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
struct ib_udata *udata)
{
const struct ib_gid_attr *old_sgid_attr;
struct ib_ah *ah;
int err;
err = rdma_fill_sgid_attr(pd->device, ah_attr, &old_sgid_attr);
if (err)
return ERR_PTR(err);
if (ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) {
err = ib_resolve_eth_dmac(pd->device, ah_attr);
if (err)
return ERR_PTR(err);
if (err) {
ah = ERR_PTR(err);
goto out;
}
}
return _rdma_create_ah(pd, ah_attr, udata);
ah = _rdma_create_ah(pd, ah_attr, udata);
out:
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
return ah;
}
EXPORT_SYMBOL(rdma_create_user_ah);
@ -455,16 +640,16 @@ static bool find_gid_index(const union ib_gid *gid,
return true;
}
static int get_sgid_index_from_eth(struct ib_device *device, u8 port_num,
static const struct ib_gid_attr *
get_sgid_attr_from_eth(struct ib_device *device, u8 port_num,
u16 vlan_id, const union ib_gid *sgid,
enum ib_gid_type gid_type,
u16 *gid_index)
enum ib_gid_type gid_type)
{
struct find_gid_index_context context = {.vlan_id = vlan_id,
.gid_type = gid_type};
return ib_find_gid_by_filter(device, sgid, port_num, find_gid_index,
&context, gid_index);
return rdma_find_gid_by_filter(device, sgid, port_num, find_gid_index,
&context);
}
int ib_get_gids_from_rdma_hdr(const union rdma_network_hdr *hdr,
@ -508,39 +693,24 @@ EXPORT_SYMBOL(ib_get_gids_from_rdma_hdr);
static int ib_resolve_unicast_gid_dmac(struct ib_device *device,
struct rdma_ah_attr *ah_attr)
{
struct ib_gid_attr sgid_attr;
struct ib_global_route *grh;
struct ib_global_route *grh = rdma_ah_retrieve_grh(ah_attr);
const struct ib_gid_attr *sgid_attr = grh->sgid_attr;
int hop_limit = 0xff;
union ib_gid sgid;
int ret;
grh = rdma_ah_retrieve_grh(ah_attr);
ret = ib_query_gid(device,
rdma_ah_get_port_num(ah_attr),
grh->sgid_index,
&sgid, &sgid_attr);
if (ret || !sgid_attr.ndev) {
if (!ret)
ret = -ENXIO;
return ret;
}
int ret = 0;
/* If destination is link local and source GID is RoCEv1,
* IP stack is not used.
*/
if (rdma_link_local_addr((struct in6_addr *)grh->dgid.raw) &&
sgid_attr.gid_type == IB_GID_TYPE_ROCE) {
sgid_attr->gid_type == IB_GID_TYPE_ROCE) {
rdma_get_ll_mac((struct in6_addr *)grh->dgid.raw,
ah_attr->roce.dmac);
goto done;
return ret;
}
ret = rdma_addr_find_l2_eth_by_grh(&sgid, &grh->dgid,
ret = rdma_addr_find_l2_eth_by_grh(&sgid_attr->gid, &grh->dgid,
ah_attr->roce.dmac,
sgid_attr.ndev, &hop_limit);
done:
dev_put(sgid_attr.ndev);
sgid_attr->ndev, &hop_limit);
grh->hop_limit = hop_limit;
return ret;
@ -555,16 +725,18 @@ done:
* as sgid and, sgid is used as dgid because sgid contains destinations
* GID whom to respond to.
*
* On success the caller is responsible to call rdma_destroy_ah_attr on the
* attr.
*/
int ib_init_ah_attr_from_wc(struct ib_device *device, u8 port_num,
const struct ib_wc *wc, const struct ib_grh *grh,
struct rdma_ah_attr *ah_attr)
{
u32 flow_class;
u16 gid_index;
int ret;
enum rdma_network_type net_type = RDMA_NETWORK_IB;
enum ib_gid_type gid_type = IB_GID_TYPE_IB;
const struct ib_gid_attr *sgid_attr;
int hoplimit = 0xff;
union ib_gid dgid;
union ib_gid sgid;
@ -595,72 +767,141 @@ int ib_init_ah_attr_from_wc(struct ib_device *device, u8 port_num,
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;
ret = get_sgid_index_from_eth(device, port_num,
sgid_attr = get_sgid_attr_from_eth(device, port_num,
vlan_id, &dgid,
gid_type, &gid_index);
if (ret)
return ret;
gid_type);
if (IS_ERR(sgid_attr))
return PTR_ERR(sgid_attr);
flow_class = be32_to_cpu(grh->version_tclass_flow);
rdma_ah_set_grh(ah_attr, &sgid,
rdma_move_grh_sgid_attr(ah_attr,
&sgid,
flow_class & 0xFFFFF,
(u8)gid_index, hoplimit,
(flow_class >> 20) & 0xFF);
return ib_resolve_unicast_gid_dmac(device, ah_attr);
hoplimit,
(flow_class >> 20) & 0xFF,
sgid_attr);
ret = ib_resolve_unicast_gid_dmac(device, ah_attr);
if (ret)
rdma_destroy_ah_attr(ah_attr);
return ret;
} else {
rdma_ah_set_dlid(ah_attr, wc->slid);
rdma_ah_set_path_bits(ah_attr, wc->dlid_path_bits);
if (wc->wc_flags & IB_WC_GRH) {
if (dgid.global.interface_id != cpu_to_be64(IB_SA_WELL_KNOWN_GUID)) {
ret = ib_find_cached_gid_by_port(device, &dgid,
IB_GID_TYPE_IB,
port_num, NULL,
&gid_index);
if (ret)
return ret;
} else {
gid_index = 0;
}
if ((wc->wc_flags & IB_WC_GRH) == 0)
return 0;
if (dgid.global.interface_id !=
cpu_to_be64(IB_SA_WELL_KNOWN_GUID)) {
sgid_attr = rdma_find_gid_by_port(
device, &dgid, IB_GID_TYPE_IB, port_num, NULL);
} else
sgid_attr = rdma_get_gid_attr(device, port_num, 0);
if (IS_ERR(sgid_attr))
return PTR_ERR(sgid_attr);
flow_class = be32_to_cpu(grh->version_tclass_flow);
rdma_ah_set_grh(ah_attr, &sgid,
rdma_move_grh_sgid_attr(ah_attr,
&sgid,
flow_class & 0xFFFFF,
(u8)gid_index, hoplimit,
(flow_class >> 20) & 0xFF);
}
hoplimit,
(flow_class >> 20) & 0xFF,
sgid_attr);
return 0;
}
}
EXPORT_SYMBOL(ib_init_ah_attr_from_wc);
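ib_init_ah_attr_from_wc() pulls the flow label and traffic class out of the GRH's version_tclass_flow word with the masks `0xFFFFF` and `(x >> 20) & 0xFF`. The sketch below spells out that bit layout in userspace C. The struct and function names are hypothetical, and the word is assumed to be in host byte order already (the kernel code applies be32_to_cpu() first).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the first 32-bit GRH word, host byte order (hypothetical type). */
struct grh_word0 {
	uint8_t version;       /* bits 31..28 */
	uint8_t traffic_class; /* bits 27..20 */
	uint32_t flow_label;   /* bits 19..0  */
};

/* Split a host-order version/tclass/flow word into its three fields. */
struct grh_word0 grh_word0_unpack(uint32_t word)
{
	struct grh_word0 f;

	f.version = (word >> 28) & 0xF;
	f.traffic_class = (word >> 20) & 0xFF;
	f.flow_label = word & 0xFFFFF;
	return f;
}

/* Inverse of grh_word0_unpack(); rejects values that do not fit the fields. */
uint32_t grh_word0_pack(struct grh_word0 f)
{
	assert(f.version <= 0xF);
	assert(f.flow_label <= 0xFFFFF);
	return ((uint32_t)f.version << 28) |
	       ((uint32_t)f.traffic_class << 20) |
	       f.flow_label;
}
```

For example, unpacking `0x6AB12345` gives version 6, traffic class `0xAB`, and flow label `0x12345`.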
/**
* rdma_move_grh_sgid_attr - Sets the sgid attribute of GRH, taking ownership
* of the reference
*
* @attr: Pointer to AH attribute structure
* @dgid: Destination GID
* @flow_label: Flow label
* @hop_limit: Hop limit
* @traffic_class: traffic class
* @sgid_attr: Pointer to SGID attribute
*
* This takes ownership of the sgid_attr reference. The caller must ensure
* rdma_destroy_ah_attr() is called before destroying the rdma_ah_attr after
* calling this function.
*/
void rdma_move_grh_sgid_attr(struct rdma_ah_attr *attr, union ib_gid *dgid,
u32 flow_label, u8 hop_limit, u8 traffic_class,
const struct ib_gid_attr *sgid_attr)
{
rdma_ah_set_grh(attr, dgid, flow_label, sgid_attr->index, hop_limit,
traffic_class);
attr->grh.sgid_attr = sgid_attr;
}
EXPORT_SYMBOL(rdma_move_grh_sgid_attr);
/**
* rdma_destroy_ah_attr - Release reference to SGID attribute of
* ah attribute.
* @ah_attr: Pointer to ah attribute
*
* Release reference to the SGID attribute of the ah attribute if it is
* non NULL. It is safe to call this multiple times, and safe to call it on
* a zero initialized ah_attr.
*/
void rdma_destroy_ah_attr(struct rdma_ah_attr *ah_attr)
{
if (ah_attr->grh.sgid_attr) {
rdma_put_gid_attr(ah_attr->grh.sgid_attr);
ah_attr->grh.sgid_attr = NULL;
}
}
EXPORT_SYMBOL(rdma_destroy_ah_attr);
struct ib_ah *ib_create_ah_from_wc(struct ib_pd *pd, const struct ib_wc *wc,
const struct ib_grh *grh, u8 port_num)
{
struct rdma_ah_attr ah_attr;
struct ib_ah *ah;
int ret;
ret = ib_init_ah_attr_from_wc(pd->device, port_num, wc, grh, &ah_attr);
if (ret)
return ERR_PTR(ret);
return rdma_create_ah(pd, &ah_attr);
ah = rdma_create_ah(pd, &ah_attr);
rdma_destroy_ah_attr(&ah_attr);
return ah;
}
EXPORT_SYMBOL(ib_create_ah_from_wc);
int rdma_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr)
{
const struct ib_gid_attr *old_sgid_attr;
int ret;
if (ah->type != ah_attr->type)
return -EINVAL;
return ah->device->modify_ah ?
ret = rdma_fill_sgid_attr(ah->device, ah_attr, &old_sgid_attr);
if (ret)
return ret;
ret = ah->device->modify_ah ?
ah->device->modify_ah(ah, ah_attr) :
-EOPNOTSUPP;
ah->sgid_attr = rdma_update_sgid_attr(ah_attr, ah->sgid_attr);
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
return ret;
}
EXPORT_SYMBOL(rdma_modify_ah);
int rdma_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr)
{
ah_attr->grh.sgid_attr = NULL;
return ah->device->query_ah ?
ah->device->query_ah(ah, ah_attr) :
-EOPNOTSUPP;
@ -669,13 +910,17 @@ EXPORT_SYMBOL(rdma_query_ah);
int rdma_destroy_ah(struct ib_ah *ah)
{
const struct ib_gid_attr *sgid_attr = ah->sgid_attr;
struct ib_pd *pd;
int ret;
pd = ah->pd;
ret = ah->device->destroy_ah(ah);
if (!ret)
if (!ret) {
atomic_dec(&pd->usecnt);
if (sgid_attr)
rdma_put_gid_attr(sgid_attr);
}
return ret;
}
@ -1290,16 +1535,19 @@ bool ib_modify_qp_is_ok(enum ib_qp_state cur_state, enum ib_qp_state next_state,
}
EXPORT_SYMBOL(ib_modify_qp_is_ok);
/**
* ib_resolve_eth_dmac - Resolve destination mac address
* @device: Device to consider
* @ah_attr: address handle attribute which describes the
* source and destination parameters
* ib_resolve_eth_dmac() resolves destination mac address and L3 hop limit. It
* returns 0 on success or appropriate error code. It initializes the
* necessary ah_attr fields when call is successful.
*/
static int ib_resolve_eth_dmac(struct ib_device *device,
struct rdma_ah_attr *ah_attr)
{
int ret = 0;
struct ib_global_route *grh;
if (!rdma_is_port_valid(device, rdma_ah_get_port_num(ah_attr)))
return -EINVAL;
grh = rdma_ah_retrieve_grh(ah_attr);
if (rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) {
if (ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw)) {
@ -1317,6 +1565,14 @@ static int ib_resolve_eth_dmac(struct ib_device *device,
return ret;
}
static bool is_qp_type_connected(const struct ib_qp *qp)
{
return (qp->qp_type == IB_QPT_UC ||
qp->qp_type == IB_QPT_RC ||
qp->qp_type == IB_QPT_XRC_INI ||
qp->qp_type == IB_QPT_XRC_TGT);
}
/**
* IB core internal function to perform QP attributes modification.
*/
@ -1324,8 +1580,53 @@ static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
{
u8 port = attr_mask & IB_QP_PORT ? attr->port_num : qp->port;
const struct ib_gid_attr *old_sgid_attr_av;
const struct ib_gid_attr *old_sgid_attr_alt_av;
int ret;
if (attr_mask & IB_QP_AV) {
ret = rdma_fill_sgid_attr(qp->device, &attr->ah_attr,
&old_sgid_attr_av);
if (ret)
return ret;
}
if (attr_mask & IB_QP_ALT_PATH) {
/*
* FIXME: This does not track the migration state, so if the
* user loads a new alternate path after the HW has migrated
* from primary->alternate we will keep the wrong
* references. This is OK for IB because the reference
* counting does not serve any functional purpose.
*/
ret = rdma_fill_sgid_attr(qp->device, &attr->alt_ah_attr,
&old_sgid_attr_alt_av);
if (ret)
goto out_av;
/*
* Today the core code can only handle alternate paths and APM
* for IB. Ban them in roce mode.
*/
if (!(rdma_protocol_ib(qp->device,
attr->alt_ah_attr.port_num) &&
rdma_protocol_ib(qp->device, port))) {
ret = -EINVAL;
goto out;
}
}
/*
* If the user provided the qp_attr then we have to resolve it. Kernel
* users have to provide already resolved rdma_ah_attr's
*/
if (udata && (attr_mask & IB_QP_AV) &&
attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE &&
is_qp_type_connected(qp)) {
ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr);
if (ret)
goto out;
}
if (rdma_ib_or_roce(qp->device, port)) {
if (attr_mask & IB_QP_RQ_PSN && attr->rq_psn & ~0xffffff) {
pr_warn("%s: %s rq_psn overflow, masking to 24 bits\n",
@ -1341,20 +1642,27 @@ static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
}
ret = ib_security_modify_qp(qp, attr, attr_mask, udata);
if (!ret && (attr_mask & IB_QP_PORT))
if (ret)
goto out;
if (attr_mask & IB_QP_PORT)
qp->port = attr->port_num;
if (attr_mask & IB_QP_AV)
qp->av_sgid_attr =
rdma_update_sgid_attr(&attr->ah_attr, qp->av_sgid_attr);
if (attr_mask & IB_QP_ALT_PATH)
qp->alt_path_sgid_attr = rdma_update_sgid_attr(
&attr->alt_ah_attr, qp->alt_path_sgid_attr);
out:
if (attr_mask & IB_QP_ALT_PATH)
rdma_unfill_sgid_attr(&attr->alt_ah_attr, old_sgid_attr_alt_av);
out_av:
if (attr_mask & IB_QP_AV)
rdma_unfill_sgid_attr(&attr->ah_attr, old_sgid_attr_av);
return ret;
}
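_ib_modify_qp() above uses two exit labels so that each error path releases exactly what was acquired before it: `out` unwinds both the alternate path and the primary AV, while `out_av` unwinds only the primary AV. A simplified userspace sketch of that layered unwinding follows. The `demo_*` names, the `fail_*` switches, and the chosen error codes are hypothetical, and unlike the kernel function this sketch acquires the first resource unconditionally.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical resource with a flag that records whether it is held. */
struct demo_res {
	int held;
};

int demo_acquire(struct demo_res *r, int fail)
{
	if (fail)
		return -ENOENT;
	r->held = 1;
	return 0;
}

void demo_release(struct demo_res *r)
{
	assert(r->held);
	r->held = 0;
}

/*
 * Acquire av, optionally acquire alt, then run the operation. Success and
 * failure share the same exit path, and each label releases only what was
 * taken before the jump to it.
 */
int demo_modify(struct demo_res *av, struct demo_res *alt,
		int want_alt, int fail_alt, int fail_op)
{
	int ret;

	ret = demo_acquire(av, 0);
	if (ret)
		return ret;

	if (want_alt) {
		ret = demo_acquire(alt, fail_alt);
		if (ret)
			goto out_av; /* alt was never taken */
	}

	if (fail_op) {
		ret = -EINVAL; /* errors are negative errno values */
		goto out;
	}
	ret = 0;

out:
	if (want_alt)
		demo_release(alt);
out_av:
	demo_release(av);
	return ret;
}
```

Ordering the labels in reverse order of acquisition lets a later failure fall through the earlier cleanup steps without duplicating them.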
static bool is_qp_type_connected(const struct ib_qp *qp)
{
return (qp->qp_type == IB_QPT_UC ||
qp->qp_type == IB_QPT_RC ||
qp->qp_type == IB_QPT_XRC_INI ||
qp->qp_type == IB_QPT_XRC_TGT);
}
/**
* ib_modify_qp_with_udata - Modifies the attributes for the specified QP.
* @ib_qp: The QP to modify.
@ -1369,17 +1677,7 @@ static bool is_qp_type_connected(const struct ib_qp *qp)
int ib_modify_qp_with_udata(struct ib_qp *ib_qp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
{
struct ib_qp *qp = ib_qp->real_qp;
int ret;
if (attr_mask & IB_QP_AV &&
attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE &&
is_qp_type_connected(qp)) {
ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr);
if (ret)
return ret;
}
return _ib_modify_qp(qp, attr, attr_mask, udata);
return _ib_modify_qp(ib_qp->real_qp, attr, attr_mask, udata);
}
EXPORT_SYMBOL(ib_modify_qp_with_udata);
@ -1451,6 +1749,9 @@ int ib_query_qp(struct ib_qp *qp,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr)
{
qp_attr->ah_attr.grh.sgid_attr = NULL;
qp_attr->alt_ah_attr.grh.sgid_attr = NULL;
return qp->device->query_qp ?
qp->device->query_qp(qp->real_qp, qp_attr, qp_attr_mask, qp_init_attr) :
-EOPNOTSUPP;
@ -1509,6 +1810,8 @@ static int __ib_destroy_shared_qp(struct ib_qp *qp)
int ib_destroy_qp(struct ib_qp *qp)
{
const struct ib_gid_attr *alt_path_sgid_attr = qp->alt_path_sgid_attr;
const struct ib_gid_attr *av_sgid_attr = qp->av_sgid_attr;
struct ib_pd *pd;
struct ib_cq *scq, *rcq;
struct ib_srq *srq;
@ -1539,6 +1842,10 @@ int ib_destroy_qp(struct ib_qp *qp)
rdma_restrack_del(&qp->res);
ret = qp->device->destroy_qp(qp);
if (!ret) {
if (alt_path_sgid_attr)
rdma_put_gid_attr(alt_path_sgid_attr);
if (av_sgid_attr)
rdma_put_gid_attr(av_sgid_attr);
if (pd)
atomic_dec(&pd->usecnt);
if (scq)
@ -1977,35 +2284,6 @@ int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *rwq_ind_table)
}
EXPORT_SYMBOL(ib_destroy_rwq_ind_table);
struct ib_flow *ib_create_flow(struct ib_qp *qp,
struct ib_flow_attr *flow_attr,
int domain)
{
struct ib_flow *flow_id;
if (!qp->device->create_flow)
return ERR_PTR(-EOPNOTSUPP);
flow_id = qp->device->create_flow(qp, flow_attr, domain, NULL);
if (!IS_ERR(flow_id)) {
atomic_inc(&qp->usecnt);
flow_id->qp = qp;
}
return flow_id;
}
EXPORT_SYMBOL(ib_create_flow);
int ib_destroy_flow(struct ib_flow *flow_id)
{
int err;
struct ib_qp *qp = flow_id->qp;
err = qp->device->destroy_flow(flow_id);
if (!err)
atomic_dec(&qp->usecnt);
return err;
}
EXPORT_SYMBOL(ib_destroy_flow);
int ib_check_mr_status(struct ib_mr *mr, u32 check_mask,
struct ib_mr_status *mr_status)
{
@ -2200,7 +2478,6 @@ static void __ib_drain_sq(struct ib_qp *qp)
struct ib_cq *cq = qp->send_cq;
struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
struct ib_drain_cqe sdrain;
struct ib_send_wr *bad_swr;
struct ib_rdma_wr swr = {
.wr = {
.next = NULL,
@ -2219,7 +2496,7 @@ static void __ib_drain_sq(struct ib_qp *qp)
sdrain.cqe.done = ib_drain_qp_done;
init_completion(&sdrain.done);
ret = ib_post_send(qp, &swr.wr, &bad_swr);
ret = ib_post_send(qp, &swr.wr, NULL);
if (ret) {
WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
return;
@ -2240,7 +2517,7 @@ static void __ib_drain_rq(struct ib_qp *qp)
struct ib_cq *cq = qp->recv_cq;
struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
struct ib_drain_cqe rdrain;
struct ib_recv_wr rwr = {}, *bad_rwr;
struct ib_recv_wr rwr = {};
int ret;
ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
@ -2253,7 +2530,7 @@ static void __ib_drain_rq(struct ib_qp *qp)
rdrain.cqe.done = ib_drain_qp_done;
init_completion(&rdrain.done);
ret = ib_post_recv(qp, &rwr, &bad_rwr);
ret = ib_post_recv(qp, &rwr, NULL);
if (ret) {
WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret);
return;


@ -166,7 +166,8 @@ int bnxt_re_query_device(struct ib_device *ibdev,
| IB_DEVICE_MEM_WINDOW
| IB_DEVICE_MEM_WINDOW_TYPE_2B
| IB_DEVICE_MEM_MGT_EXTENSIONS;
ib_attr->max_sge = dev_attr->max_qp_sges;
ib_attr->max_send_sge = dev_attr->max_qp_sges;
ib_attr->max_recv_sge = dev_attr->max_qp_sges;
ib_attr->max_sge_rd = dev_attr->max_qp_sges;
ib_attr->max_cq = dev_attr->max_cq;
ib_attr->max_cqe = dev_attr->max_cq_wqes;
@ -243,8 +244,8 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
port_attr->gid_tbl_len = dev_attr->max_sgid;
port_attr->port_cap_flags = IB_PORT_CM_SUP | IB_PORT_REINIT_SUP |
IB_PORT_DEVICE_MGMT_SUP |
IB_PORT_VENDOR_CLASS_SUP |
IB_PORT_IP_BASED_GIDS;
IB_PORT_VENDOR_CLASS_SUP;
port_attr->ip_gids = true;
port_attr->max_msg_sz = (u32)BNXT_RE_MAX_MR_SIZE_LOW;
port_attr->bad_pkey_cntr = 0;
@ -364,8 +365,7 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context)
return rc;
}
int bnxt_re_add_gid(const union ib_gid *gid,
const struct ib_gid_attr *attr, void **context)
int bnxt_re_add_gid(const struct ib_gid_attr *attr, void **context)
{
int rc;
u32 tbl_idx = 0;
@ -377,7 +377,7 @@ int bnxt_re_add_gid(const union ib_gid *gid,
if ((attr->ndev) && is_vlan_dev(attr->ndev))
vlan_id = vlan_dev_vlan_id(attr->ndev);
rc = bnxt_qplib_add_sgid(sgid_tbl, (struct bnxt_qplib_gid *)gid,
rc = bnxt_qplib_add_sgid(sgid_tbl, (struct bnxt_qplib_gid *)&attr->gid,
rdev->qplib_res.netdev->dev_addr,
vlan_id, true, &tbl_idx);
if (rc == -EALREADY) {
@ -673,8 +673,6 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *ib_pd,
int rc;
u8 nw_type;
struct ib_gid_attr sgid_attr;
if (!(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH)) {
dev_err(rdev_to_dev(rdev), "Failed to alloc AH: GRH not set");
return ERR_PTR(-EINVAL);
@ -705,20 +703,11 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *ib_pd,
grh->dgid.raw) &&
!rdma_link_local_addr((struct in6_addr *)
grh->dgid.raw)) {
union ib_gid sgid;
const struct ib_gid_attr *sgid_attr;
rc = ib_get_cached_gid(&rdev->ibdev, 1,
grh->sgid_index, &sgid,
&sgid_attr);
if (rc) {
dev_err(rdev_to_dev(rdev),
"Failed to query gid at index %d",
grh->sgid_index);
goto fail;
}
dev_put(sgid_attr.ndev);
sgid_attr = grh->sgid_attr;
/* Get network header type for this GID */
nw_type = ib_gid_to_network_type(sgid_attr.gid_type, &sgid);
nw_type = rdma_gid_attr_network_type(sgid_attr);
switch (nw_type) {
case RDMA_NETWORK_IPV4:
ah->qplib_ah.nw_type = CMDQ_CREATE_AH_TYPE_V2IPV4;
@ -1408,7 +1397,7 @@ struct ib_srq *bnxt_re_create_srq(struct ib_pd *ib_pd,
}
if (srq_init_attr->srq_type != IB_SRQT_BASIC) {
rc = -ENOTSUPP;
rc = -EOPNOTSUPP;
goto exit;
}
@ -1530,8 +1519,8 @@ int bnxt_re_query_srq(struct ib_srq *ib_srq, struct ib_srq_attr *srq_attr)
return 0;
}
int bnxt_re_post_srq_recv(struct ib_srq *ib_srq, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
int bnxt_re_post_srq_recv(struct ib_srq *ib_srq, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
@ -1599,9 +1588,6 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
enum ib_qp_state curr_qp_state, new_qp_state;
int rc, entries;
int status;
union ib_gid sgid;
struct ib_gid_attr sgid_attr;
unsigned int flags;
u8 nw_type;
@ -1668,6 +1654,7 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
if (qp_attr_mask & IB_QP_AV) {
const struct ib_global_route *grh =
rdma_ah_read_grh(&qp_attr->ah_attr);
const struct ib_gid_attr *sgid_attr;
qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_DGID |
CMDQ_MODIFY_QP_MODIFY_MASK_FLOW_LABEL |
@ -1691,15 +1678,10 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
ether_addr_copy(qp->qplib_qp.ah.dmac,
qp_attr->ah_attr.roce.dmac);
status = ib_get_cached_gid(&rdev->ibdev, 1,
grh->sgid_index,
&sgid, &sgid_attr);
if (!status) {
memcpy(qp->qplib_qp.smac, sgid_attr.ndev->dev_addr,
sgid_attr = qp_attr->ah_attr.grh.sgid_attr;
memcpy(qp->qplib_qp.smac, sgid_attr->ndev->dev_addr,
ETH_ALEN);
dev_put(sgid_attr.ndev);
nw_type = ib_gid_to_network_type(sgid_attr.gid_type,
&sgid);
nw_type = rdma_gid_attr_network_type(sgid_attr);
switch (nw_type) {
case RDMA_NETWORK_IPV4:
qp->qplib_qp.nw_type =
@ -1715,7 +1697,6 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
break;
}
}
}
if (qp_attr_mask & IB_QP_PATH_MTU) {
qp->qplib_qp.modify_flags |=
@ -1895,19 +1876,17 @@ out:
/* Routine for sending QP1 packets for RoCE V1 and V2
*/
static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
struct ib_send_wr *wr,
const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe,
int payload_size)
{
struct ib_device *ibdev = &qp->rdev->ibdev;
struct bnxt_re_ah *ah = container_of(ud_wr(wr)->ah, struct bnxt_re_ah,
ib_ah);
struct bnxt_qplib_ah *qplib_ah = &ah->qplib_ah;
const struct ib_gid_attr *sgid_attr = ah->ib_ah.sgid_attr;
struct bnxt_qplib_sge sge;
union ib_gid sgid;
u8 nw_type;
u16 ether_type;
struct ib_gid_attr sgid_attr;
union ib_gid dgid;
bool is_eth = false;
bool is_vlan = false;
@ -1920,22 +1899,10 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
memset(&qp->qp1_hdr, 0, sizeof(qp->qp1_hdr));
rc = ib_get_cached_gid(ibdev, 1,
qplib_ah->host_sgid_index, &sgid,
&sgid_attr);
if (rc) {
dev_err(rdev_to_dev(qp->rdev),
"Failed to query gid at index %d",
qplib_ah->host_sgid_index);
return rc;
}
if (sgid_attr.ndev) {
if (is_vlan_dev(sgid_attr.ndev))
vlan_id = vlan_dev_vlan_id(sgid_attr.ndev);
dev_put(sgid_attr.ndev);
}
if (is_vlan_dev(sgid_attr->ndev))
vlan_id = vlan_dev_vlan_id(sgid_attr->ndev);
/* Get network header type for this GID */
nw_type = ib_gid_to_network_type(sgid_attr.gid_type, &sgid);
nw_type = rdma_gid_attr_network_type(sgid_attr);
switch (nw_type) {
case RDMA_NETWORK_IPV4:
nw_type = BNXT_RE_ROCEV2_IPV4_PACKET;
@ -1948,9 +1915,9 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
break;
}
memcpy(&dgid.raw, &qplib_ah->dgid, 16);
is_udp = sgid_attr.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP;
is_udp = sgid_attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP;
if (is_udp) {
if (ipv6_addr_v4mapped((struct in6_addr *)&sgid)) {
if (ipv6_addr_v4mapped((struct in6_addr *)&sgid_attr->gid)) {
ip_version = 4;
ether_type = ETH_P_IP;
} else {
@ -1983,9 +1950,10 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
}
if (is_grh || (ip_version == 6)) {
memcpy(qp->qp1_hdr.grh.source_gid.raw, sgid.raw, sizeof(sgid));
memcpy(qp->qp1_hdr.grh.source_gid.raw, sgid_attr->gid.raw,
sizeof(sgid_attr->gid));
memcpy(qp->qp1_hdr.grh.destination_gid.raw, qplib_ah->dgid.data,
sizeof(sgid));
sizeof(sgid_attr->gid));
qp->qp1_hdr.grh.hop_limit = qplib_ah->hop_limit;
}
@ -1995,7 +1963,7 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
qp->qp1_hdr.ip4.frag_off = htons(IP_DF);
qp->qp1_hdr.ip4.ttl = qplib_ah->hop_limit;
memcpy(&qp->qp1_hdr.ip4.saddr, sgid.raw + 12, 4);
memcpy(&qp->qp1_hdr.ip4.saddr, sgid_attr->gid.raw + 12, 4);
memcpy(&qp->qp1_hdr.ip4.daddr, qplib_ah->dgid.data + 12, 4);
qp->qp1_hdr.ip4.check = ib_ud_ip4_csum(&qp->qp1_hdr);
}
@ -2080,7 +2048,7 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
* and the MAD datagram out to the provided SGE.
*/
static int bnxt_re_build_qp1_shadow_qp_recv(struct bnxt_re_qp *qp,
struct ib_recv_wr *wr,
const struct ib_recv_wr *wr,
struct bnxt_qplib_swqe *wqe,
int payload_size)
{
@ -2125,7 +2093,7 @@ static int is_ud_qp(struct bnxt_re_qp *qp)
}
static int bnxt_re_build_send_wqe(struct bnxt_re_qp *qp,
struct ib_send_wr *wr,
const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
struct bnxt_re_ah *ah = NULL;
@ -2163,7 +2131,7 @@ static int bnxt_re_build_send_wqe(struct bnxt_re_qp *qp,
return 0;
}
static int bnxt_re_build_rdma_wqe(struct ib_send_wr *wr,
static int bnxt_re_build_rdma_wqe(const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
switch (wr->opcode) {
@ -2195,7 +2163,7 @@ static int bnxt_re_build_rdma_wqe(struct ib_send_wr *wr,
return 0;
}
static int bnxt_re_build_atomic_wqe(struct ib_send_wr *wr,
static int bnxt_re_build_atomic_wqe(const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
switch (wr->opcode) {
@ -2222,7 +2190,7 @@ static int bnxt_re_build_atomic_wqe(struct ib_send_wr *wr,
return 0;
}
static int bnxt_re_build_inv_wqe(struct ib_send_wr *wr,
static int bnxt_re_build_inv_wqe(const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
wqe->type = BNXT_QPLIB_SWQE_TYPE_LOCAL_INV;
@ -2241,7 +2209,7 @@ static int bnxt_re_build_inv_wqe(struct ib_send_wr *wr,
return 0;
}
static int bnxt_re_build_reg_wqe(struct ib_reg_wr *wr,
static int bnxt_re_build_reg_wqe(const struct ib_reg_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
struct bnxt_re_mr *mr = container_of(wr->mr, struct bnxt_re_mr, ib_mr);
@ -2283,7 +2251,7 @@ static int bnxt_re_build_reg_wqe(struct ib_reg_wr *wr,
}
static int bnxt_re_copy_inline_data(struct bnxt_re_dev *rdev,
struct ib_send_wr *wr,
const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
/* Copy the inline data to the data field */
@ -2313,7 +2281,7 @@ static int bnxt_re_copy_inline_data(struct bnxt_re_dev *rdev,
}
static int bnxt_re_copy_wr_payload(struct bnxt_re_dev *rdev,
struct ib_send_wr *wr,
const struct ib_send_wr *wr,
struct bnxt_qplib_swqe *wqe)
{
int payload_sz = 0;
@ -2345,7 +2313,7 @@ static void bnxt_ud_qp_hw_stall_workaround(struct bnxt_re_qp *qp)
static int bnxt_re_post_send_shadow_qp(struct bnxt_re_dev *rdev,
struct bnxt_re_qp *qp,
struct ib_send_wr *wr)
const struct ib_send_wr *wr)
{
struct bnxt_qplib_swqe wqe;
int rc = 0, payload_sz = 0;
@ -2393,8 +2361,8 @@ bad:
return rc;
}
int bnxt_re_post_send(struct ib_qp *ib_qp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr)
int bnxt_re_post_send(struct ib_qp *ib_qp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr)
{
struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
struct bnxt_qplib_swqe wqe;
@ -2441,7 +2409,7 @@ int bnxt_re_post_send(struct ib_qp *ib_qp, struct ib_send_wr *wr,
default:
break;
}
/* Fall thru to build the wqe */
/* fall through */
case IB_WR_SEND_WITH_INV:
rc = bnxt_re_build_send_wqe(qp, wr, &wqe);
break;
@ -2493,7 +2461,7 @@ bad:
static int bnxt_re_post_recv_shadow_qp(struct bnxt_re_dev *rdev,
struct bnxt_re_qp *qp,
struct ib_recv_wr *wr)
const struct ib_recv_wr *wr)
{
struct bnxt_qplib_swqe wqe;
int rc = 0;
@ -2526,8 +2494,8 @@ static int bnxt_re_post_recv_shadow_qp(struct bnxt_re_dev *rdev,
return rc;
}
int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
struct bnxt_qplib_swqe wqe;


@ -158,8 +158,7 @@ void bnxt_re_query_fw_str(struct ib_device *ibdev, char *str);
int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num,
u16 index, u16 *pkey);
int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context);
int bnxt_re_add_gid(const union ib_gid *gid,
const struct ib_gid_attr *attr, void **context);
int bnxt_re_add_gid(const struct ib_gid_attr *attr, void **context);
int bnxt_re_query_gid(struct ib_device *ibdev, u8 port_num,
int index, union ib_gid *gid);
enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev,
@ -182,8 +181,8 @@ int bnxt_re_modify_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr,
struct ib_udata *udata);
int bnxt_re_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
int bnxt_re_destroy_srq(struct ib_srq *srq);
int bnxt_re_post_srq_recv(struct ib_srq *srq, struct ib_recv_wr *recv_wr,
struct ib_recv_wr **bad_recv_wr);
int bnxt_re_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata);
@ -192,10 +191,10 @@ int bnxt_re_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
int bnxt_re_destroy_qp(struct ib_qp *qp);
int bnxt_re_post_send(struct ib_qp *qp, struct ib_send_wr *send_wr,
struct ib_send_wr **bad_send_wr);
int bnxt_re_post_recv(struct ib_qp *qp, struct ib_recv_wr *recv_wr,
struct ib_recv_wr **bad_recv_wr);
int bnxt_re_post_send(struct ib_qp *qp, const struct ib_send_wr *send_wr,
const struct ib_send_wr **bad_send_wr);
int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,


@ -2354,7 +2354,7 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
srq = qp->srq;
if (!srq)
return -EINVAL;
if (wr_id_idx > srq->hwq.max_elements) {
if (wr_id_idx >= srq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process RC ");
dev_err(&cq->hwq.pdev->dev,
@ -2369,7 +2369,7 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
*pcqe = cqe;
} else {
rq = &qp->rq;
if (wr_id_idx > rq->hwq.max_elements) {
if (wr_id_idx >= rq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process RC ");
dev_err(&cq->hwq.pdev->dev,
@ -2437,7 +2437,7 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
if (!srq)
return -EINVAL;
if (wr_id_idx > srq->hwq.max_elements) {
if (wr_id_idx >= srq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process UD ");
dev_err(&cq->hwq.pdev->dev,
@ -2452,7 +2452,7 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
*pcqe = cqe;
} else {
rq = &qp->rq;
if (wr_id_idx > rq->hwq.max_elements) {
if (wr_id_idx >= rq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process UD ");
dev_err(&cq->hwq.pdev->dev,
@ -2546,7 +2546,7 @@ static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
"QPLIB: FP: SRQ used but not defined??");
return -EINVAL;
}
if (wr_id_idx > srq->hwq.max_elements) {
if (wr_id_idx >= srq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process Raw/QP1 ");
dev_err(&cq->hwq.pdev->dev,
@ -2561,7 +2561,7 @@ static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
*pcqe = cqe;
} else {
rq = &qp->rq;
if (wr_id_idx > rq->hwq.max_elements) {
if (wr_id_idx >= rq->hwq.max_elements) {
dev_err(&cq->hwq.pdev->dev,
"QPLIB: FP: CQ Process Raw/QP1 RQ wr_id ");
dev_err(&cq->hwq.pdev->dev,


@ -197,7 +197,7 @@ int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res,
struct bnxt_qplib_sgid_tbl *sgid_tbl, int index,
struct bnxt_qplib_gid *gid)
{
if (index > sgid_tbl->max) {
if (index >= sgid_tbl->max) {
dev_err(&res->pdev->dev,
"QPLIB: Index %d exceeded SGID table max (%d)",
index, sgid_tbl->max);
@ -402,7 +402,7 @@ int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
*pkey = 0xFFFF;
return 0;
}
if (index > pkey_tbl->max) {
if (index >= pkey_tbl->max) {
dev_err(&res->pdev->dev,
"QPLIB: Index %d exceeded PKEY table max (%d)",
index, pkey_tbl->max);
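
The hunks above change `index > max` to `index >= max` in several lookups. A minimal sketch of why that matters, assuming `max` counts the valid slots (so the last valid index is `max - 1`); the `pkey_table` struct and `table_get()` helper are hypothetical stand-ins, not the bnxt_re types:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical table: 'max' is the number of slots, valid indices are 0..max-1. */
struct pkey_table {
	unsigned short *tbl;
	int max;
};

/* Returns 0 and stores the entry, or -EINVAL when index is out of range. */
static int table_get(const struct pkey_table *t, int index, unsigned short *out)
{
	assert(t && out);
	/* '>=' rejects index == max; a plain '>' would read one slot past the end. */
	if (index < 0 || index >= t->max)
		return -EINVAL;
	*out = t->tbl[index];
	return 0;
}
```

With a four-entry table, index 3 succeeds and index 4 is rejected; under the old `>` comparison index 4 would have been accepted.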


@ -32,38 +32,16 @@
#include "iwch_provider.h"
#include "iwch.h"
/*
* Get one cq entry from cxio and map it to openib.
*
* Returns:
* 0 EMPTY;
* 1 cqe returned
* -EAGAIN caller must try again
* any other -errno fatal error
*/
static int iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
struct ib_wc *wc)
static int __iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
struct iwch_qp *qhp, struct ib_wc *wc)
{
struct iwch_qp *qhp = NULL;
struct t3_cqe cqe, *rd_cqe;
struct t3_wq *wq;
struct t3_wq *wq = qhp ? &qhp->wq : NULL;
struct t3_cqe cqe;
u32 credit = 0;
u8 cqe_flushed;
u64 cookie;
int ret = 1;
rd_cqe = cxio_next_cqe(&chp->cq);
if (!rd_cqe)
return 0;
qhp = get_qhp(rhp, CQE_QPID(*rd_cqe));
if (!qhp)
wq = NULL;
else {
spin_lock(&qhp->lock);
wq = &(qhp->wq);
}
ret = cxio_poll_cq(wq, &(chp->cq), &cqe, &cqe_flushed, &cookie,
&credit);
if (t3a_device(chp->rhp) && credit) {
@ -79,7 +57,7 @@ static int iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
ret = 1;
wc->wr_id = cookie;
wc->qp = &qhp->ibqp;
wc->qp = qhp ? &qhp->ibqp : NULL;
wc->vendor_err = CQE_STATUS(cqe);
wc->wc_flags = 0;
@ -182,8 +160,38 @@ static int iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
}
}
out:
if (wq)
return ret;
}
/*
* Get one cq entry from cxio and map it to openib.
*
* Returns:
* 0 EMPTY;
* 1 cqe returned
* -EAGAIN caller must try again
* any other -errno fatal error
*/
static int iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
struct ib_wc *wc)
{
struct iwch_qp *qhp;
struct t3_cqe *rd_cqe;
int ret;
rd_cqe = cxio_next_cqe(&chp->cq);
if (!rd_cqe)
return 0;
qhp = get_qhp(rhp, CQE_QPID(*rd_cqe));
if (qhp) {
spin_lock(&qhp->lock);
ret = __iwch_poll_cq_one(rhp, chp, qhp, wc);
spin_unlock(&qhp->lock);
} else {
ret = __iwch_poll_cq_one(rhp, chp, NULL, wc);
}
return ret;
}
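
The refactoring above splits the poll routine into an inner worker that never touches the lock and a thin wrapper that takes and releases it, so no exit path can leave the lock held. A user-space sketch of that shape follows; `fake_lock`, `do_poll_one()` and `poll_one()` are illustrative names only, and the counter merely stands in for a spinlock:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a spinlock: counts nesting so the sketch runs in user space. */
struct fake_lock { int depth; };

static void fake_lock_acquire(struct fake_lock *l) { l->depth++; }
static void fake_lock_release(struct fake_lock *l) { l->depth--; }

struct qp {
	struct fake_lock lock;
	int polled;
};

/* Inner worker: assumes the caller holds qp->lock when qp is non-NULL. */
static int do_poll_one(struct qp *qp)
{
	if (qp) {
		assert(qp->lock.depth == 1);
		qp->polled++;
	}
	return 1; /* one completion returned */
}

/* Wrapper: the only place that locks, so every return path is balanced. */
static int poll_one(struct qp *qp)
{
	int ret;

	if (qp) {
		fake_lock_acquire(&qp->lock);
		ret = do_poll_one(qp);
		fake_lock_release(&qp->lock);
	} else {
		ret = do_poll_one(NULL);
	}
	return ret;
}
```

Keeping lock handling in a single function avoids the earlier pattern where an `out:` label had to remember whether the lock was taken.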


@ -61,42 +61,6 @@
#include <rdma/cxgb3-abi.h>
#include "common.h"
static struct ib_ah *iwch_ah_create(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
struct ib_udata *udata)
{
return ERR_PTR(-ENOSYS);
}
static int iwch_ah_destroy(struct ib_ah *ah)
{
return -ENOSYS;
}
static int iwch_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
{
return -ENOSYS;
}
static int iwch_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
{
return -ENOSYS;
}
static int iwch_process_mad(struct ib_device *ibdev,
int mad_flags,
u8 port_num,
const struct ib_wc *in_wc,
const struct ib_grh *in_grh,
const struct ib_mad_hdr *in_mad,
size_t in_mad_size,
struct ib_mad_hdr *out_mad,
size_t *out_mad_size,
u16 *out_mad_pkey_index)
{
return -ENOSYS;
}
static int iwch_dealloc_ucontext(struct ib_ucontext *context)
{
struct iwch_dev *rhp = to_iwch_dev(context->device);
@ -1103,7 +1067,8 @@ static int iwch_query_device(struct ib_device *ibdev, struct ib_device_attr *pro
props->max_mr_size = dev->attr.max_mr_size;
props->max_qp = dev->attr.max_qps;
props->max_qp_wr = dev->attr.max_wrs;
props->max_sge = dev->attr.max_sge_per_wr;
props->max_send_sge = dev->attr.max_sge_per_wr;
props->max_recv_sge = dev->attr.max_sge_per_wr;
props->max_sge_rd = 1;
props->max_qp_rd_atom = dev->attr.max_rdma_reads_per_qp;
props->max_qp_init_rd_atom = dev->attr.max_rdma_reads_per_qp;
@ -1398,8 +1363,6 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.mmap = iwch_mmap;
dev->ibdev.alloc_pd = iwch_allocate_pd;
dev->ibdev.dealloc_pd = iwch_deallocate_pd;
dev->ibdev.create_ah = iwch_ah_create;
dev->ibdev.destroy_ah = iwch_ah_destroy;
dev->ibdev.create_qp = iwch_create_qp;
dev->ibdev.modify_qp = iwch_ib_modify_qp;
dev->ibdev.destroy_qp = iwch_destroy_qp;
@ -1414,9 +1377,6 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dealloc_mw = iwch_dealloc_mw;
dev->ibdev.alloc_mr = iwch_alloc_mr;
dev->ibdev.map_mr_sg = iwch_map_mr_sg;
dev->ibdev.attach_mcast = iwch_multicast_attach;
dev->ibdev.detach_mcast = iwch_multicast_detach;
dev->ibdev.process_mad = iwch_process_mad;
dev->ibdev.req_notify_cq = iwch_arm_cq;
dev->ibdev.post_send = iwch_post_send;
dev->ibdev.post_recv = iwch_post_receive;


@ -326,10 +326,10 @@ enum iwch_qp_query_flags {
};
u16 iwch_rqes_posted(struct iwch_qp *qhp);
int iwch_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr);
int iwch_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr);
int iwch_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr);
int iwch_post_receive(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr);
int iwch_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int iwch_post_terminate(struct iwch_qp *qhp, struct respQ_msg_t *rsp_msg);
int iwch_post_zb_read(struct iwch_ep *ep);


@ -39,8 +39,8 @@
#define NO_SUPPORT -1
static int build_rdma_send(union t3_wr *wqe, struct ib_send_wr *wr,
u8 * flit_cnt)
static int build_rdma_send(union t3_wr *wqe, const struct ib_send_wr *wr,
u8 *flit_cnt)
{
int i;
u32 plen;
@ -84,7 +84,7 @@ static int build_rdma_send(union t3_wr *wqe, struct ib_send_wr *wr,
return 0;
}
static int build_rdma_write(union t3_wr *wqe, struct ib_send_wr *wr,
static int build_rdma_write(union t3_wr *wqe, const struct ib_send_wr *wr,
u8 *flit_cnt)
{
int i;
@ -125,7 +125,7 @@ static int build_rdma_write(union t3_wr *wqe, struct ib_send_wr *wr,
return 0;
}
static int build_rdma_read(union t3_wr *wqe, struct ib_send_wr *wr,
static int build_rdma_read(union t3_wr *wqe, const struct ib_send_wr *wr,
u8 *flit_cnt)
{
if (wr->num_sge > 1)
@ -146,7 +146,7 @@ static int build_rdma_read(union t3_wr *wqe, struct ib_send_wr *wr,
return 0;
}
static int build_memreg(union t3_wr *wqe, struct ib_reg_wr *wr,
static int build_memreg(union t3_wr *wqe, const struct ib_reg_wr *wr,
u8 *flit_cnt, int *wr_cnt, struct t3_wq *wq)
{
struct iwch_mr *mhp = to_iwch_mr(wr->mr);
@ -189,7 +189,7 @@ static int build_memreg(union t3_wr *wqe, struct ib_reg_wr *wr,
return 0;
}
static int build_inv_stag(union t3_wr *wqe, struct ib_send_wr *wr,
static int build_inv_stag(union t3_wr *wqe, const struct ib_send_wr *wr,
u8 *flit_cnt)
{
wqe->local_inv.stag = cpu_to_be32(wr->ex.invalidate_rkey);
@ -246,7 +246,7 @@ static int iwch_sgl2pbl_map(struct iwch_dev *rhp, struct ib_sge *sg_list,
}
static int build_rdma_recv(struct iwch_qp *qhp, union t3_wr *wqe,
struct ib_recv_wr *wr)
const struct ib_recv_wr *wr)
{
int i, err = 0;
u32 pbl_addr[T3_MAX_SGE];
@ -286,7 +286,7 @@ static int build_rdma_recv(struct iwch_qp *qhp, union t3_wr *wqe,
}
static int build_zero_stag_recv(struct iwch_qp *qhp, union t3_wr *wqe,
struct ib_recv_wr *wr)
const struct ib_recv_wr *wr)
{
int i;
u32 pbl_addr;
@ -348,8 +348,8 @@ static int build_zero_stag_recv(struct iwch_qp *qhp, union t3_wr *wqe,
return 0;
}
int iwch_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr)
int iwch_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr)
{
int err = 0;
u8 uninitialized_var(t3_wr_flit_cnt);
@ -463,8 +463,8 @@ out:
return err;
}
int iwch_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
int iwch_post_receive(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
int err = 0;
struct iwch_qp *qhp;
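
The signature changes above make the work-request chain read-only for the driver and turn `bad_wr` into a pointer to a const request. The sketch below shows how such a chain might be walked, plus a wrapper that tolerates a NULL `bad_wr` by substituting a local dummy. All names (`send_wr`, `drv_post_send`, `post_send`, `MAX_SGE`) are hypothetical, and the NULL handling is an assumption about how a core-layer wrapper could behave, not a copy of the kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_SGE 4 /* assumed per-request limit for this sketch */

struct send_wr {
	const struct send_wr *next;
	int num_sge;
};

/* Driver side: may read the chain but never modifies it. */
static int drv_post_send(const struct send_wr *wr, const struct send_wr **bad_wr)
{
	assert(bad_wr); /* the driver can rely on a valid out-pointer */
	for (; wr; wr = wr->next) {
		if (wr->num_sge > MAX_SGE) {
			*bad_wr = wr; /* report the first request that failed */
			return -EINVAL;
		}
	}
	return 0;
}

/* Caller-facing wrapper: callers that do not care may pass bad_wr == NULL. */
static int post_send(const struct send_wr *wr, const struct send_wr **bad_wr)
{
	const struct send_wr *dummy;

	return drv_post_send(wr, bad_wr ? bad_wr : &dummy);
}
```

Because the chain is const, a caller can keep a prebuilt request list and post it repeatedly without worrying that the driver rewrote it.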


@ -587,24 +587,29 @@ static int send_flowc(struct c4iw_ep *ep)
{
struct fw_flowc_wr *flowc;
struct sk_buff *skb = skb_dequeue(&ep->com.ep_skb_list);
int i;
u16 vlan = ep->l2t->vlan;
int nparams;
int flowclen, flowclen16;
if (WARN_ON(!skb))
return -ENOMEM;
if (vlan == CPL_L2T_VLAN_NONE)
nparams = 8;
else
nparams = 9;
else
nparams = 10;
flowc = __skb_put(skb, FLOWC_LEN);
flowclen = offsetof(struct fw_flowc_wr, mnemval[nparams]);
flowclen16 = DIV_ROUND_UP(flowclen, 16);
flowclen = flowclen16 * 16;
flowc = __skb_put(skb, flowclen);
memset(flowc, 0, flowclen);
flowc->op_to_nparams = cpu_to_be32(FW_WR_OP_V(FW_FLOWC_WR) |
FW_FLOWC_WR_NPARAMS_V(nparams));
flowc->flowid_len16 = cpu_to_be32(FW_WR_LEN16_V(DIV_ROUND_UP(FLOWC_LEN,
16)) | FW_WR_FLOWID_V(ep->hwtid));
flowc->flowid_len16 = cpu_to_be32(FW_WR_LEN16_V(flowclen16) |
FW_WR_FLOWID_V(ep->hwtid));
flowc->mnemval[0].mnemonic = FW_FLOWC_MNEM_PFNVFN;
flowc->mnemval[0].val = cpu_to_be32(FW_PFVF_CMD_PFN_V
@ -623,21 +628,13 @@ static int send_flowc(struct c4iw_ep *ep)
flowc->mnemval[6].val = cpu_to_be32(ep->snd_win);
flowc->mnemval[7].mnemonic = FW_FLOWC_MNEM_MSS;
flowc->mnemval[7].val = cpu_to_be32(ep->emss);
if (nparams == 9) {
flowc->mnemval[8].mnemonic = FW_FLOWC_MNEM_RCV_SCALE;
flowc->mnemval[8].val = cpu_to_be32(ep->snd_wscale);
if (nparams == 10) {
u16 pri;
pri = (vlan & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
flowc->mnemval[8].mnemonic = FW_FLOWC_MNEM_SCHEDCLASS;
flowc->mnemval[8].val = cpu_to_be32(pri);
} else {
/* Pad WR to 16 byte boundary */
flowc->mnemval[8].mnemonic = 0;
flowc->mnemval[8].val = 0;
}
for (i = 0; i < 9; i++) {
flowc->mnemval[i].r4[0] = 0;
flowc->mnemval[i].r4[1] = 0;
flowc->mnemval[i].r4[2] = 0;
flowc->mnemval[9].mnemonic = FW_FLOWC_MNEM_SCHEDCLASS;
flowc->mnemval[9].val = cpu_to_be32(pri);
}
set_wr_txq(skb, CPL_PRIORITY_DATA, ep->txq_idx);
@ -1176,6 +1173,7 @@ static int act_establish(struct c4iw_dev *dev, struct sk_buff *skb)
{
struct c4iw_ep *ep;
struct cpl_act_establish *req = cplhdr(skb);
unsigned short tcp_opt = ntohs(req->tcp_opt);
unsigned int tid = GET_TID(req);
unsigned int atid = TID_TID_G(ntohl(req->tos_atid));
struct tid_info *t = dev->rdev.lldi.tids;
@ -1196,8 +1194,9 @@ static int act_establish(struct c4iw_dev *dev, struct sk_buff *skb)
ep->snd_seq = be32_to_cpu(req->snd_isn);
ep->rcv_seq = be32_to_cpu(req->rcv_isn);
ep->snd_wscale = TCPOPT_SND_WSCALE_G(tcp_opt);
set_emss(ep, ntohs(req->tcp_opt));
set_emss(ep, tcp_opt);
/* dealloc the atid */
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, atid);
@ -1853,10 +1852,33 @@ static int rx_data(struct c4iw_dev *dev, struct sk_buff *skb)
return 0;
}
static void complete_cached_srq_buffers(struct c4iw_ep *ep,
__be32 srqidx_status)
{
enum chip_type adapter_type;
u32 srqidx;
adapter_type = ep->com.dev->rdev.lldi.adapter_type;
srqidx = ABORT_RSS_SRQIDX_G(be32_to_cpu(srqidx_status));
/*
* If this TCB had a srq buffer cached, then we must complete
* it. For user mode, that means saving the srqidx in the
* user/kernel status page for this qp. For kernel mode, just
* synthesize the CQE now.
*/
if (CHELSIO_CHIP_VERSION(adapter_type) > CHELSIO_T5 && srqidx) {
if (ep->com.qp->ibqp.uobject)
t4_set_wq_in_error(&ep->com.qp->wq, srqidx);
else
c4iw_flush_srqidx(ep->com.qp, srqidx);
}
}
static int abort_rpl(struct c4iw_dev *dev, struct sk_buff *skb)
{
struct c4iw_ep *ep;
struct cpl_abort_rpl_rss *rpl = cplhdr(skb);
struct cpl_abort_rpl_rss6 *rpl = cplhdr(skb);
int release = 0;
unsigned int tid = GET_TID(rpl);
@ -1865,6 +1887,9 @@ static int abort_rpl(struct c4iw_dev *dev, struct sk_buff *skb)
pr_warn("Abort rpl to freed endpoint\n");
return 0;
}
complete_cached_srq_buffers(ep, rpl->srqidx_status);
pr_debug("ep %p tid %u\n", ep, ep->hwtid);
mutex_lock(&ep->com.mutex);
switch (ep->com.state) {
@ -2603,16 +2628,17 @@ static int pass_establish(struct c4iw_dev *dev, struct sk_buff *skb)
struct cpl_pass_establish *req = cplhdr(skb);
unsigned int tid = GET_TID(req);
int ret;
u16 tcp_opt = ntohs(req->tcp_opt);
ep = get_ep_from_tid(dev, tid);
pr_debug("ep %p tid %u\n", ep, ep->hwtid);
ep->snd_seq = be32_to_cpu(req->snd_isn);
ep->rcv_seq = be32_to_cpu(req->rcv_isn);
ep->snd_wscale = TCPOPT_SND_WSCALE_G(tcp_opt);
pr_debug("ep %p hwtid %u tcp_opt 0x%02x\n", ep, tid,
ntohs(req->tcp_opt));
pr_debug("ep %p hwtid %u tcp_opt 0x%02x\n", ep, tid, tcp_opt);
set_emss(ep, ntohs(req->tcp_opt));
set_emss(ep, tcp_opt);
dst_confirm(ep->dst);
mutex_lock(&ep->com.mutex);
@ -2719,28 +2745,35 @@ static int peer_close(struct c4iw_dev *dev, struct sk_buff *skb)
static int peer_abort(struct c4iw_dev *dev, struct sk_buff *skb)
{
struct cpl_abort_req_rss *req = cplhdr(skb);
struct cpl_abort_req_rss6 *req = cplhdr(skb);
struct c4iw_ep *ep;
struct sk_buff *rpl_skb;
struct c4iw_qp_attributes attrs;
int ret;
int release = 0;
unsigned int tid = GET_TID(req);
u8 status;
u32 len = roundup(sizeof(struct cpl_abort_rpl), 16);
ep = get_ep_from_tid(dev, tid);
if (!ep)
return 0;
if (cxgb_is_neg_adv(req->status)) {
status = ABORT_RSS_STATUS_G(be32_to_cpu(req->srqidx_status));
if (cxgb_is_neg_adv(status)) {
pr_debug("Negative advice on abort- tid %u status %d (%s)\n",
ep->hwtid, req->status, neg_adv_str(req->status));
ep->hwtid, status, neg_adv_str(status));
ep->stats.abort_neg_adv++;
mutex_lock(&dev->rdev.stats.lock);
dev->rdev.stats.neg_adv++;
mutex_unlock(&dev->rdev.stats.lock);
goto deref_ep;
}
complete_cached_srq_buffers(ep, req->srqidx_status);
pr_debug("ep %p tid %u state %u\n", ep, ep->hwtid,
ep->com.state);
set_bit(PEER_ABORT, &ep->com.history);
@ -3444,9 +3477,6 @@ int c4iw_create_listen(struct iw_cm_id *cm_id, int backlog)
}
insert_handle(dev, &dev->stid_idr, ep, ep->stid);
memcpy(&ep->com.local_addr, &cm_id->m_local_addr,
sizeof(ep->com.local_addr));
state_set(&ep->com, LISTEN);
if (ep->com.local_addr.ss_family == AF_INET)
err = create_server4(dev, ep);
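
The `send_flowc()` hunk in this file replaces the fixed `FLOWC_LEN` with a length derived from the parameter count and rounded up to 16-byte units. A rough sketch of that arithmetic follows. The struct layouts are simplified assumptions (an 8-byte header followed by 8-byte parameter entries) rather than the real firmware definitions, and `flowc_len()` is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed layout: one 8-byte entry per flow parameter. */
struct flowc_mnemval {
	uint8_t mnemonic;
	uint8_t r4[3];
	uint32_t val;
};

/* Assumed layout: 8-byte header, then a variable number of entries. */
struct flowc_wr {
	uint32_t op_to_nparams;
	uint32_t flowid_len16;
	struct flowc_mnemval mnemval[];
};

/* Returns the padded byte length and stores the length in 16-byte units. */
static size_t flowc_len(unsigned int nparams, unsigned int *len16)
{
	size_t len = offsetof(struct flowc_wr, mnemval) +
		     nparams * sizeof(struct flowc_mnemval);
	unsigned int units = DIV_ROUND_UP(len, 16);

	assert(len16);
	*len16 = units;
	return (size_t)units * 16; /* pad the request to a 16-byte boundary */
}
```

Under these assumed sizes, 9 parameters need 80 bytes (5 units) and 10 parameters need 88 bytes, padded to 96 (6 units). Zeroing the whole padded buffer once, as the diff does with `memset()`, then covers both the reserved fields and the tail padding.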


@ -77,6 +77,10 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
int user = (uctx != &rdev->uctx);
int ret;
struct sk_buff *skb;
struct c4iw_ucontext *ucontext = NULL;
if (user)
ucontext = container_of(uctx, struct c4iw_ucontext, uctx);
cq->cqid = c4iw_get_cqid(rdev, uctx);
if (!cq->cqid) {
@ -100,6 +104,16 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
dma_unmap_addr_set(cq, mapping, cq->dma_addr);
memset(cq->queue, 0, cq->memsize);
if (user && ucontext->is_32b_cqe) {
cq->qp_errp = &((struct t4_status_page *)
((u8 *)cq->queue + (cq->size - 1) *
(sizeof(*cq->queue) / 2)))->qp_err;
} else {
cq->qp_errp = &((struct t4_status_page *)
((u8 *)cq->queue + (cq->size - 1) *
sizeof(*cq->queue)))->qp_err;
}
/* build fw_ri_res_wr */
wr_len = sizeof *res_wr + sizeof *res;
@ -132,7 +146,9 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
FW_RI_RES_WR_IQPCIECH_V(2) |
FW_RI_RES_WR_IQINTCNTTHRESH_V(0) |
FW_RI_RES_WR_IQO_F |
FW_RI_RES_WR_IQESIZE_V(1));
((user && ucontext->is_32b_cqe) ?
FW_RI_RES_WR_IQESIZE_V(1) :
FW_RI_RES_WR_IQESIZE_V(2)));
res->u.cq.iqsize = cpu_to_be16(cq->size);
res->u.cq.iqaddr = cpu_to_be64(cq->dma_addr);
@ -166,7 +182,7 @@ err1:
return ret;
}
static void insert_recv_cqe(struct t4_wq *wq, struct t4_cq *cq)
static void insert_recv_cqe(struct t4_wq *wq, struct t4_cq *cq, u32 srqidx)
{
struct t4_cqe cqe;
@ -179,6 +195,8 @@ static void insert_recv_cqe(struct t4_wq *wq, struct t4_cq *cq)
CQE_SWCQE_V(1) |
CQE_QPID_V(wq->sq.qid));
cqe.bits_type_ts = cpu_to_be64(CQE_GENBIT_V((u64)cq->gen));
if (srqidx)
cqe.u.srcqe.abs_rqe_idx = cpu_to_be32(srqidx);
cq->sw_queue[cq->sw_pidx] = cqe;
t4_swcq_produce(cq);
}
@ -191,7 +209,7 @@ int c4iw_flush_rq(struct t4_wq *wq, struct t4_cq *cq, int count)
pr_debug("wq %p cq %p rq.in_use %u skip count %u\n",
wq, cq, wq->rq.in_use, count);
while (in_use--) {
insert_recv_cqe(wq, cq);
insert_recv_cqe(wq, cq, 0);
flushed++;
}
return flushed;
@ -442,6 +460,72 @@ void c4iw_count_rcqes(struct t4_cq *cq, struct t4_wq *wq, int *count)
pr_debug("cq %p count %d\n", cq, *count);
}
static void post_pending_srq_wrs(struct t4_srq *srq)
{
struct t4_srq_pending_wr *pwr;
u16 idx = 0;
while (srq->pending_in_use) {
pwr = &srq->pending_wrs[srq->pending_cidx];
srq->sw_rq[srq->pidx].wr_id = pwr->wr_id;
srq->sw_rq[srq->pidx].valid = 1;
pr_debug("%s posting pending cidx %u pidx %u wq_pidx %u in_use %u rq_size %u wr_id %llx\n",
__func__,
srq->cidx, srq->pidx, srq->wq_pidx,
srq->in_use, srq->size,
(unsigned long long)pwr->wr_id);
c4iw_copy_wr_to_srq(srq, &pwr->wqe, pwr->len16);
t4_srq_consume_pending_wr(srq);
t4_srq_produce(srq, pwr->len16);
idx += DIV_ROUND_UP(pwr->len16 * 16, T4_EQ_ENTRY_SIZE);
}
if (idx) {
t4_ring_srq_db(srq, idx, pwr->len16, &pwr->wqe);
srq->queue[srq->size].status.host_wq_pidx =
srq->wq_pidx;
}
}
static u64 reap_srq_cqe(struct t4_cqe *hw_cqe, struct t4_srq *srq)
{
int rel_idx = CQE_ABS_RQE_IDX(hw_cqe) - srq->rqt_abs_idx;
u64 wr_id;
srq->sw_rq[rel_idx].valid = 0;
wr_id = srq->sw_rq[rel_idx].wr_id;
if (rel_idx == srq->cidx) {
pr_debug("%s in order cqe rel_idx %u cidx %u pidx %u wq_pidx %u in_use %u rq_size %u wr_id %llx\n",
__func__, rel_idx, srq->cidx, srq->pidx,
srq->wq_pidx, srq->in_use, srq->size,
(unsigned long long)srq->sw_rq[rel_idx].wr_id);
t4_srq_consume(srq);
while (srq->ooo_count && !srq->sw_rq[srq->cidx].valid) {
pr_debug("%s eat ooo cidx %u pidx %u wq_pidx %u in_use %u rq_size %u ooo_count %u wr_id %llx\n",
__func__, srq->cidx, srq->pidx,
srq->wq_pidx, srq->in_use,
srq->size, srq->ooo_count,
(unsigned long long)
srq->sw_rq[srq->cidx].wr_id);
t4_srq_consume_ooo(srq);
}
if (srq->ooo_count == 0 && srq->pending_in_use)
post_pending_srq_wrs(srq);
} else {
pr_debug("%s ooo cqe rel_idx %u cidx %u pidx %u wq_pidx %u in_use %u rq_size %u ooo_count %u wr_id %llx\n",
__func__, rel_idx, srq->cidx,
srq->pidx, srq->wq_pidx,
srq->in_use, srq->size,
srq->ooo_count,
(unsigned long long)srq->sw_rq[rel_idx].wr_id);
t4_srq_produce_ooo(srq);
}
return wr_id;
}
/*
* poll_cq
*
@ -459,7 +543,8 @@ void c4iw_count_rcqes(struct t4_cq *cq, struct t4_wq *wq, int *count)
* -EOVERFLOW CQ overflow detected.
*/
static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
u8 *cqe_flushed, u64 *cookie, u32 *credit)
u8 *cqe_flushed, u64 *cookie, u32 *credit,
struct t4_srq *srq)
{
int ret = 0;
struct t4_cqe *hw_cqe, read_cqe;
@ -524,7 +609,7 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
*/
if (CQE_TYPE(hw_cqe) == 1) {
if (CQE_STATUS(hw_cqe))
t4_set_wq_in_error(wq);
t4_set_wq_in_error(wq, 0);
ret = -EAGAIN;
goto skip_cqe;
}
@ -535,7 +620,7 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
*/
if (CQE_WRID_STAG(hw_cqe) == 1) {
if (CQE_STATUS(hw_cqe))
t4_set_wq_in_error(wq);
t4_set_wq_in_error(wq, 0);
ret = -EAGAIN;
goto skip_cqe;
}
@ -560,7 +645,7 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
if (CQE_STATUS(hw_cqe) || t4_wq_in_error(wq)) {
*cqe_flushed = (CQE_STATUS(hw_cqe) == T4_ERR_SWFLUSH);
t4_set_wq_in_error(wq);
t4_set_wq_in_error(wq, 0);
}
/*
@ -574,15 +659,9 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
* then we complete this with T4_ERR_MSN and mark the wq in
* error.
*/
if (t4_rq_empty(wq)) {
t4_set_wq_in_error(wq);
ret = -EAGAIN;
goto skip_cqe;
}
if (unlikely(!CQE_STATUS(hw_cqe) &&
CQE_WRID_MSN(hw_cqe) != wq->rq.msn)) {
t4_set_wq_in_error(wq);
t4_set_wq_in_error(wq, 0);
hw_cqe->header |= cpu_to_be32(CQE_STATUS_V(T4_ERR_MSN));
}
goto proc_cqe;
@ -641,11 +720,16 @@ proc_cqe:
c4iw_log_wr_stats(wq, hw_cqe);
t4_sq_consume(wq);
} else {
if (!srq) {
pr_debug("completing rq idx %u\n", wq->rq.cidx);
*cookie = wq->rq.sw_rq[wq->rq.cidx].wr_id;
if (c4iw_wr_log)
c4iw_log_wr_stats(wq, hw_cqe);
t4_rq_consume(wq);
} else {
*cookie = reap_srq_cqe(hw_cqe, srq);
}
wq->rq.msn++;
goto skip_cqe;
}
@ -668,46 +752,33 @@ skip_cqe:
return ret;
}
/*
* Get one cq entry from c4iw and map it to openib.
*
* Returns:
* 0 cqe returned
* -ENODATA EMPTY;
* -EAGAIN caller must try again
* any other -errno fatal error
*/
static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
static int __c4iw_poll_cq_one(struct c4iw_cq *chp, struct c4iw_qp *qhp,
struct ib_wc *wc, struct c4iw_srq *srq)
{
struct c4iw_qp *qhp = NULL;
struct t4_cqe uninitialized_var(cqe), *rd_cqe;
struct t4_wq *wq;
struct t4_cqe uninitialized_var(cqe);
struct t4_wq *wq = qhp ? &qhp->wq : NULL;
u32 credit = 0;
u8 cqe_flushed;
u64 cookie = 0;
int ret;
ret = t4_next_cqe(&chp->cq, &rd_cqe);
if (ret)
return ret;
qhp = get_qhp(chp->rhp, CQE_QPID(rd_cqe));
if (!qhp)
wq = NULL;
else {
spin_lock(&qhp->lock);
wq = &(qhp->wq);
}
ret = poll_cq(wq, &(chp->cq), &cqe, &cqe_flushed, &cookie, &credit);
ret = poll_cq(wq, &(chp->cq), &cqe, &cqe_flushed, &cookie, &credit,
srq ? &srq->wq : NULL);
if (ret)
goto out;
wc->wr_id = cookie;
wc->qp = &qhp->ibqp;
wc->qp = qhp ? &qhp->ibqp : NULL;
wc->vendor_err = CQE_STATUS(&cqe);
wc->wc_flags = 0;
/*
* Simulate a SRQ_LIMIT_REACHED HW notification if required.
*/
if (srq && !(srq->flags & T4_SRQ_LIMIT_SUPPORT) && srq->armed &&
srq->wq.in_use < srq->srq_limit)
c4iw_dispatch_srq_limit_reached_event(srq);
pr_debug("qpid 0x%x type %d opcode %d status 0x%x len %u wrid hi 0x%x lo 0x%x cookie 0x%llx\n",
CQE_QPID(&cqe),
CQE_TYPE(&cqe), CQE_OPCODE(&cqe),
@ -720,15 +791,32 @@ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
wc->byte_len = CQE_LEN(&cqe);
else
wc->byte_len = 0;
switch (CQE_OPCODE(&cqe)) {
case FW_RI_SEND:
wc->opcode = IB_WC_RECV;
break;
case FW_RI_SEND_WITH_INV:
case FW_RI_SEND_WITH_SE_INV:
wc->opcode = IB_WC_RECV;
if (CQE_OPCODE(&cqe) == FW_RI_SEND_WITH_INV ||
CQE_OPCODE(&cqe) == FW_RI_SEND_WITH_SE_INV) {
wc->ex.invalidate_rkey = CQE_WRID_STAG(&cqe);
wc->wc_flags |= IB_WC_WITH_INVALIDATE;
c4iw_invalidate_mr(qhp->rhp, wc->ex.invalidate_rkey);
break;
case FW_RI_WRITE_IMMEDIATE:
wc->opcode = IB_WC_RECV_RDMA_WITH_IMM;
wc->ex.imm_data = CQE_IMM_DATA(&cqe);
wc->wc_flags |= IB_WC_WITH_IMM;
break;
default:
pr_err("Unexpected opcode %d in the CQE received for QPID=0x%0x\n",
CQE_OPCODE(&cqe), CQE_QPID(&cqe));
ret = -EINVAL;
goto out;
}
} else {
switch (CQE_OPCODE(&cqe)) {
case FW_RI_WRITE_IMMEDIATE:
case FW_RI_RDMA_WRITE:
wc->opcode = IB_WC_RDMA_WRITE;
break;
@ -819,8 +907,43 @@ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
}
}
out:
if (wq)
return ret;
}
/*
* Get one cq entry from c4iw and map it to openib.
*
* Returns:
* 0 cqe returned
* -ENODATA EMPTY;
* -EAGAIN caller must try again
* any other -errno fatal error
*/
static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
{
struct c4iw_srq *srq = NULL;
struct c4iw_qp *qhp = NULL;
struct t4_cqe *rd_cqe;
int ret;
ret = t4_next_cqe(&chp->cq, &rd_cqe);
if (ret)
return ret;
qhp = get_qhp(chp->rhp, CQE_QPID(rd_cqe));
if (qhp) {
spin_lock(&qhp->lock);
srq = qhp->srq;
if (srq)
spin_lock(&srq->lock);
ret = __c4iw_poll_cq_one(chp, qhp, wc, srq);
spin_unlock(&qhp->lock);
if (srq)
spin_unlock(&srq->lock);
} else {
ret = __c4iw_poll_cq_one(chp, NULL, wc, NULL);
}
return ret;
}
@ -876,6 +999,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
int vector = attr->comp_vector;
struct c4iw_dev *rhp;
struct c4iw_cq *chp;
struct c4iw_create_cq ucmd;
struct c4iw_create_cq_resp uresp;
struct c4iw_ucontext *ucontext = NULL;
int ret, wr_len;
@ -891,9 +1015,16 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
if (vector >= rhp->rdev.lldi.nciq)
return ERR_PTR(-EINVAL);
if (ib_context) {
ucontext = to_c4iw_ucontext(ib_context);
if (udata->inlen < sizeof(ucmd))
ucontext->is_32b_cqe = 1;
}
chp = kzalloc(sizeof(*chp), GFP_KERNEL);
if (!chp)
return ERR_PTR(-ENOMEM);
chp->wr_waitp = c4iw_alloc_wr_wait(GFP_KERNEL);
if (!chp->wr_waitp) {
ret = -ENOMEM;
@ -908,9 +1039,6 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
goto err_free_wr_wait;
}
if (ib_context)
ucontext = to_c4iw_ucontext(ib_context);
/* account for the status page. */
entries++;
@ -934,13 +1062,15 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
if (hwentries < 64)
hwentries = 64;
memsize = hwentries * sizeof *chp->cq.queue;
memsize = hwentries * ((ucontext && ucontext->is_32b_cqe) ?
(sizeof(*chp->cq.queue) / 2) : sizeof(*chp->cq.queue));
/*
* memsize must be a multiple of the page size if it's a user cq.
*/
if (ucontext)
memsize = roundup(memsize, PAGE_SIZE);
chp->cq.size = hwentries;
chp->cq.memsize = memsize;
chp->cq.vector = vector;
@ -971,6 +1101,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
if (!mm2)
goto err_free_mm;
memset(&uresp, 0, sizeof(uresp));
uresp.qid_mask = rhp->rdev.cqmask;
uresp.cqid = chp->cq.cqid;
uresp.size = chp->cq.size;
@ -980,9 +1111,16 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
ucontext->key += PAGE_SIZE;
uresp.gts_key = ucontext->key;
ucontext->key += PAGE_SIZE;
/* communicate to the userspace that
* kernel driver supports 64B CQE
*/
uresp.flags |= C4IW_64B_CQE;
spin_unlock(&ucontext->mmap_lock);
ret = ib_copy_to_udata(udata, &uresp,
sizeof(uresp) - sizeof(uresp.reserved));
ucontext->is_32b_cqe ?
sizeof(uresp) - sizeof(uresp.flags) :
sizeof(uresp));
if (ret)
goto err_free_mm2;
@ -1019,11 +1157,6 @@ err_free_chp:
return ERR_PTR(ret);
}
int c4iw_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata)
{
return -ENOSYS;
}
int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
{
struct c4iw_cq *chp;
@ -1039,3 +1172,19 @@ int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
spin_unlock_irqrestore(&chp->lock, flag);
return ret;
}
void c4iw_flush_srqidx(struct c4iw_qp *qhp, u32 srqidx)
{
struct c4iw_cq *rchp = to_c4iw_cq(qhp->ibqp.recv_cq);
unsigned long flag;
/* locking hierarchy: cq lock first, then qp lock. */
spin_lock_irqsave(&rchp->lock, flag);
spin_lock(&qhp->lock);
/* create a SRQ RECV CQE for srqidx */
insert_recv_cqe(&qhp->wq, &rchp->cq, srqidx);
spin_unlock(&qhp->lock);
spin_unlock_irqrestore(&rchp->lock, flag);
}
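
The CQ changes in this file size the queue by CQE width (half-size entries for older user libraries, full-size otherwise), round user queues up to a page multiple, and locate the status page in the last entry slot. A small sketch of that arithmetic, assuming a 64-byte full CQE and a 4096-byte page; both constants and the helper names are illustrative, not taken from the driver headers:

```c
#include <assert.h>
#include <stddef.h>

#define CQE_SIZE 64u   /* assumed size of a full CQE */
#define PAGE_SZ  4096u /* assumed page size */

/* In the diff only user CQs can be 32B; this helper does not enforce that. */
static size_t cqe_size(int is_32b)
{
	return is_32b ? CQE_SIZE / 2 : CQE_SIZE;
}

/* Queue memory in bytes; user-mapped queues are rounded up to whole pages. */
static size_t cq_memsize(size_t hwentries, int is_user, int is_32b)
{
	size_t memsize = hwentries * cqe_size(is_32b);

	assert(hwentries > 0);
	if (is_user)
		memsize = (memsize + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
	return memsize;
}

/* Byte offset of the status page, which occupies the last entry slot. */
static size_t status_page_offset(size_t hwentries, int is_32b)
{
	assert(hwentries > 0);
	return (hwentries - 1) * cqe_size(is_32b);
}
```

The offset depends on the entry width, which is why the diff computes `qp_errp` separately for the 32B and 64B cases.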


@ -275,10 +275,11 @@ static int dump_qp(int id, void *p, void *data)
set_ep_sin_addrs(ep, &lsin, &rsin, &m_lsin, &m_rsin);
cc = snprintf(qpd->buf + qpd->pos, space,
"rc qp sq id %u rq id %u state %u "
"rc qp sq id %u %s id %u state %u "
"onchip %u ep tid %u state %u "
"%pI4:%u/%u->%pI4:%u/%u\n",
qp->wq.sq.qid, qp->wq.rq.qid,
qp->wq.sq.qid, qp->srq ? "srq" : "rq",
qp->srq ? qp->srq->idx : qp->wq.rq.qid,
(int)qp->attr.state,
qp->wq.sq.flags & T4_SQ_ONCHIP,
ep->hwtid, (int)ep->com.state,
@ -480,6 +481,9 @@ static int stats_show(struct seq_file *seq, void *v)
seq_printf(seq, " QID: %10llu %10llu %10llu %10llu\n",
dev->rdev.stats.qid.total, dev->rdev.stats.qid.cur,
dev->rdev.stats.qid.max, dev->rdev.stats.qid.fail);
seq_printf(seq, " SRQS: %10llu %10llu %10llu %10llu\n",
dev->rdev.stats.srqt.total, dev->rdev.stats.srqt.cur,
dev->rdev.stats.srqt.max, dev->rdev.stats.srqt.fail);
seq_printf(seq, " TPTMEM: %10llu %10llu %10llu %10llu\n",
dev->rdev.stats.stag.total, dev->rdev.stats.stag.cur,
dev->rdev.stats.stag.max, dev->rdev.stats.stag.fail);
@ -530,6 +534,8 @@ static ssize_t stats_clear(struct file *file, const char __user *buf,
dev->rdev.stats.pbl.fail = 0;
dev->rdev.stats.rqt.max = 0;
dev->rdev.stats.rqt.fail = 0;
dev->rdev.stats.rqt.max = 0;
dev->rdev.stats.rqt.fail = 0;
dev->rdev.stats.ocqp.max = 0;
dev->rdev.stats.ocqp.fail = 0;
dev->rdev.stats.db_full = 0;
@ -802,7 +808,7 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
rdev->qpmask = rdev->lldi.udb_density - 1;
rdev->cqmask = rdev->lldi.ucq_density - 1;
pr_debug("dev %s stag start 0x%0x size 0x%0x num stags %d pbl start 0x%0x size 0x%0x rq start 0x%0x size 0x%0x qp qid start %u size %u cq qid start %u size %u\n",
pr_debug("dev %s stag start 0x%0x size 0x%0x num stags %d pbl start 0x%0x size 0x%0x rq start 0x%0x size 0x%0x qp qid start %u size %u cq qid start %u size %u srq size %u\n",
pci_name(rdev->lldi.pdev), rdev->lldi.vr->stag.start,
rdev->lldi.vr->stag.size, c4iw_num_stags(rdev),
rdev->lldi.vr->pbl.start,
@ -811,7 +817,8 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
rdev->lldi.vr->qp.start,
rdev->lldi.vr->qp.size,
rdev->lldi.vr->cq.start,
rdev->lldi.vr->cq.size);
rdev->lldi.vr->cq.size,
rdev->lldi.vr->srq.size);
pr_debug("udb %pR db_reg %p gts_reg %p qpmask 0x%x cqmask 0x%x\n",
&rdev->lldi.pdev->resource[2],
rdev->lldi.db_reg, rdev->lldi.gts_reg,
@ -824,10 +831,12 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
rdev->stats.stag.total = rdev->lldi.vr->stag.size;
rdev->stats.pbl.total = rdev->lldi.vr->pbl.size;
rdev->stats.rqt.total = rdev->lldi.vr->rq.size;
rdev->stats.srqt.total = rdev->lldi.vr->srq.size;
rdev->stats.ocqp.total = rdev->lldi.vr->ocq.size;
rdev->stats.qid.total = rdev->lldi.vr->qp.size;
err = c4iw_init_resource(rdev, c4iw_num_stags(rdev), T4_MAX_NUM_PD);
err = c4iw_init_resource(rdev, c4iw_num_stags(rdev),
T4_MAX_NUM_PD, rdev->lldi.vr->srq.size);
if (err) {
pr_err("error %d initializing resources\n", err);
return err;
@ -857,6 +866,7 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
rdev->status_page->qp_size = rdev->lldi.vr->qp.size;
rdev->status_page->cq_start = rdev->lldi.vr->cq.start;
rdev->status_page->cq_size = rdev->lldi.vr->cq.size;
rdev->status_page->write_cmpl_supported = rdev->lldi.write_cmpl_support;
if (c4iw_wr_log) {
rdev->wr_log = kcalloc(1 << c4iw_wr_log_size_order,


@ -70,9 +70,10 @@ static void dump_err_cqe(struct c4iw_dev *dev, struct t4_cqe *err_cqe)
CQE_STATUS(err_cqe), CQE_TYPE(err_cqe), ntohl(err_cqe->len),
CQE_WRID_HI(err_cqe), CQE_WRID_LOW(err_cqe));
pr_debug("%016llx %016llx %016llx %016llx\n",
pr_debug("%016llx %016llx %016llx %016llx - %016llx %016llx %016llx %016llx\n",
be64_to_cpu(p[0]), be64_to_cpu(p[1]), be64_to_cpu(p[2]),
be64_to_cpu(p[3]));
be64_to_cpu(p[3]), be64_to_cpu(p[4]), be64_to_cpu(p[5]),
be64_to_cpu(p[6]), be64_to_cpu(p[7]));
/*
* Ingress WRITE and READ_RESP errors provide


@ -97,6 +97,7 @@ struct c4iw_resource {
struct c4iw_id_table tpt_table;
struct c4iw_id_table qid_table;
struct c4iw_id_table pdid_table;
struct c4iw_id_table srq_table;
};
struct c4iw_qid_list {
@ -130,6 +131,8 @@ struct c4iw_stats {
struct c4iw_stat stag;
struct c4iw_stat pbl;
struct c4iw_stat rqt;
struct c4iw_stat srqt;
struct c4iw_stat srq;
struct c4iw_stat ocqp;
u64 db_full;
u64 db_empty;
@ -549,6 +552,7 @@ struct c4iw_qp {
struct kref kref;
wait_queue_head_t wait;
int sq_sig_all;
struct c4iw_srq *srq;
struct work_struct free_work;
struct c4iw_ucontext *ucontext;
struct c4iw_wr_wait *wr_waitp;
@ -559,6 +563,26 @@ static inline struct c4iw_qp *to_c4iw_qp(struct ib_qp *ibqp)
return container_of(ibqp, struct c4iw_qp, ibqp);
}
struct c4iw_srq {
struct ib_srq ibsrq;
struct list_head db_fc_entry;
struct c4iw_dev *rhp;
struct t4_srq wq;
struct sk_buff *destroy_skb;
u32 srq_limit;
u32 pdid;
int idx;
u32 flags;
spinlock_t lock; /* protects srq */
struct c4iw_wr_wait *wr_waitp;
bool armed;
};
static inline struct c4iw_srq *to_c4iw_srq(struct ib_srq *ibsrq)
{
return container_of(ibsrq, struct c4iw_srq, ibsrq);
}
struct c4iw_ucontext {
struct ib_ucontext ibucontext;
struct c4iw_dev_ucontext uctx;
@ -566,6 +590,7 @@ struct c4iw_ucontext {
spinlock_t mmap_lock;
struct list_head mmaps;
struct kref kref;
bool is_32b_cqe;
};
static inline struct c4iw_ucontext *to_c4iw_ucontext(struct ib_ucontext *c)
@ -885,7 +910,10 @@ enum conn_pre_alloc_buffers {
CN_MAX_CON_BUF
};
#define FLOWC_LEN 80
enum {
FLOWC_LEN = offsetof(struct fw_flowc_wr, mnemval[FW_FLOWC_MNEM_MAX])
};
union cpl_wr_size {
struct cpl_abort_req abrt_req;
struct cpl_abort_rpl abrt_rpl;
@ -952,6 +980,7 @@ struct c4iw_ep {
unsigned int retry_count;
int snd_win;
int rcv_win;
u32 snd_wscale;
struct c4iw_ep_stats stats;
};
@ -988,7 +1017,8 @@ void c4iw_put_qpid(struct c4iw_rdev *rdev, u32 qpid,
struct c4iw_dev_ucontext *uctx);
u32 c4iw_get_resource(struct c4iw_id_table *id_table);
void c4iw_put_resource(struct c4iw_id_table *id_table, u32 entry);
int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid);
int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt,
u32 nr_pdid, u32 nr_srqt);
int c4iw_init_ctrl_qp(struct c4iw_rdev *rdev);
int c4iw_pblpool_create(struct c4iw_rdev *rdev);
int c4iw_rqtpool_create(struct c4iw_rdev *rdev);
@ -1007,10 +1037,10 @@ void c4iw_release_dev_ucontext(struct c4iw_rdev *rdev,
void c4iw_init_dev_ucontext(struct c4iw_rdev *rdev,
struct c4iw_dev_ucontext *uctx);
int c4iw_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr);
int c4iw_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr);
int c4iw_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr);
int c4iw_post_receive(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr);
int c4iw_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param);
int c4iw_create_listen(struct iw_cm_id *cm_id, int backlog);
int c4iw_destroy_listen(struct iw_cm_id *cm_id);
@ -1037,8 +1067,14 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *ib_context,
struct ib_udata *udata);
int c4iw_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata);
int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata);
int c4iw_destroy_srq(struct ib_srq *ib_srq);
struct ib_srq *c4iw_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *attrs,
struct ib_udata *udata);
int c4iw_destroy_qp(struct ib_qp *ib_qp);
struct ib_qp *c4iw_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *attrs,
@ -1075,12 +1111,19 @@ extern c4iw_handler_func c4iw_handlers[NUM_CPL_CMDS];
void __iomem *c4iw_bar2_addrs(struct c4iw_rdev *rdev, unsigned int qid,
enum cxgb4_bar2_qtype qtype,
unsigned int *pbar2_qid, u64 *pbar2_pa);
int c4iw_alloc_srq_idx(struct c4iw_rdev *rdev);
void c4iw_free_srq_idx(struct c4iw_rdev *rdev, int idx);
extern void c4iw_log_wr_stats(struct t4_wq *wq, struct t4_cqe *cqe);
extern int c4iw_wr_log;
extern int db_fc_threshold;
extern int db_coalescing_threshold;
extern int use_dsgl;
void c4iw_invalidate_mr(struct c4iw_dev *rhp, u32 rkey);
void c4iw_dispatch_srq_limit_reached_event(struct c4iw_srq *srq);
void c4iw_copy_wr_to_srq(struct t4_srq *srq, union t4_recv_wr *wqe, u8 len16);
void c4iw_flush_srqidx(struct c4iw_qp *qhp, u32 srqidx);
int c4iw_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr);
struct c4iw_wr_wait *c4iw_alloc_wr_wait(gfp_t gfp);
typedef int c4iw_restrack_func(struct sk_buff *msg,


@ -58,41 +58,6 @@ static int fastreg_support = 1;
module_param(fastreg_support, int, 0644);
MODULE_PARM_DESC(fastreg_support, "Advertise fastreg support (default=1)");
static struct ib_ah *c4iw_ah_create(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
struct ib_udata *udata)
{
return ERR_PTR(-ENOSYS);
}
static int c4iw_ah_destroy(struct ib_ah *ah)
{
return -ENOSYS;
}
static int c4iw_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
{
return -ENOSYS;
}
static int c4iw_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
{
return -ENOSYS;
}
static int c4iw_process_mad(struct ib_device *ibdev, int mad_flags,
u8 port_num, const struct ib_wc *in_wc,
const struct ib_grh *in_grh,
const struct ib_mad_hdr *in_mad,
size_t in_mad_size,
struct ib_mad_hdr *out_mad,
size_t *out_mad_size,
u16 *out_mad_pkey_index)
{
return -ENOSYS;
}
void _c4iw_free_ucontext(struct kref *kref)
{
struct c4iw_ucontext *ucontext;
@ -342,8 +307,12 @@ static int c4iw_query_device(struct ib_device *ibdev, struct ib_device_attr *pro
props->vendor_part_id = (u32)dev->rdev.lldi.pdev->device;
props->max_mr_size = T4_MAX_MR_SIZE;
props->max_qp = dev->rdev.lldi.vr->qp.size / 2;
props->max_srq = dev->rdev.lldi.vr->srq.size;
props->max_qp_wr = dev->rdev.hw_queue.t4_max_qp_depth;
props->max_sge = T4_MAX_RECV_SGE;
props->max_srq_wr = dev->rdev.hw_queue.t4_max_qp_depth;
props->max_send_sge = min(T4_MAX_SEND_SGE, T4_MAX_WRITE_SGE);
props->max_recv_sge = T4_MAX_RECV_SGE;
props->max_srq_sge = T4_MAX_RECV_SGE;
props->max_sge_rd = 1;
props->max_res_rd_atom = dev->rdev.lldi.max_ird_adapter;
props->max_qp_rd_atom = min(dev->rdev.lldi.max_ordird_qp,
@ -592,7 +561,10 @@ void c4iw_register_device(struct work_struct *work)
(1ull << IB_USER_VERBS_CMD_POLL_CQ) |
(1ull << IB_USER_VERBS_CMD_DESTROY_QP) |
(1ull << IB_USER_VERBS_CMD_POST_SEND) |
(1ull << IB_USER_VERBS_CMD_POST_RECV);
(1ull << IB_USER_VERBS_CMD_POST_RECV) |
(1ull << IB_USER_VERBS_CMD_CREATE_SRQ) |
(1ull << IB_USER_VERBS_CMD_MODIFY_SRQ) |
(1ull << IB_USER_VERBS_CMD_DESTROY_SRQ);
dev->ibdev.node_type = RDMA_NODE_RNIC;
BUILD_BUG_ON(sizeof(C4IW_NODE_DESC) > IB_DEVICE_NODE_DESC_MAX);
memcpy(dev->ibdev.node_desc, C4IW_NODE_DESC, sizeof(C4IW_NODE_DESC));
@ -608,15 +580,15 @@ void c4iw_register_device(struct work_struct *work)
dev->ibdev.mmap = c4iw_mmap;
dev->ibdev.alloc_pd = c4iw_allocate_pd;
dev->ibdev.dealloc_pd = c4iw_deallocate_pd;
dev->ibdev.create_ah = c4iw_ah_create;
dev->ibdev.destroy_ah = c4iw_ah_destroy;
dev->ibdev.create_qp = c4iw_create_qp;
dev->ibdev.modify_qp = c4iw_ib_modify_qp;
dev->ibdev.query_qp = c4iw_ib_query_qp;
dev->ibdev.destroy_qp = c4iw_destroy_qp;
dev->ibdev.create_srq = c4iw_create_srq;
dev->ibdev.modify_srq = c4iw_modify_srq;
dev->ibdev.destroy_srq = c4iw_destroy_srq;
dev->ibdev.create_cq = c4iw_create_cq;
dev->ibdev.destroy_cq = c4iw_destroy_cq;
dev->ibdev.resize_cq = c4iw_resize_cq;
dev->ibdev.poll_cq = c4iw_poll_cq;
dev->ibdev.get_dma_mr = c4iw_get_dma_mr;
dev->ibdev.reg_user_mr = c4iw_reg_user_mr;
@ -625,12 +597,10 @@ void c4iw_register_device(struct work_struct *work)
dev->ibdev.dealloc_mw = c4iw_dealloc_mw;
dev->ibdev.alloc_mr = c4iw_alloc_mr;
dev->ibdev.map_mr_sg = c4iw_map_mr_sg;
dev->ibdev.attach_mcast = c4iw_multicast_attach;
dev->ibdev.detach_mcast = c4iw_multicast_detach;
dev->ibdev.process_mad = c4iw_process_mad;
dev->ibdev.req_notify_cq = c4iw_arm_cq;
dev->ibdev.post_send = c4iw_post_send;
dev->ibdev.post_recv = c4iw_post_receive;
dev->ibdev.post_srq_recv = c4iw_post_srq_recv;
dev->ibdev.alloc_hw_stats = c4iw_alloc_stats;
dev->ibdev.get_hw_stats = c4iw_get_mib;
dev->ibdev.uverbs_abi_ver = C4IW_UVERBS_ABI_VERSION;

File diff suppressed because it is too large


@ -53,7 +53,8 @@ static int c4iw_init_qid_table(struct c4iw_rdev *rdev)
}
/* nr_* must be power of 2 */
int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid)
int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt,
u32 nr_pdid, u32 nr_srqt)
{
int err = 0;
err = c4iw_id_table_alloc(&rdev->resource.tpt_table, 0, nr_tpt, 1,
@ -67,7 +68,17 @@ int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid)
nr_pdid, 1, 0);
if (err)
goto pdid_err;
if (!nr_srqt)
err = c4iw_id_table_alloc(&rdev->resource.srq_table, 0,
1, 1, 0);
else
err = c4iw_id_table_alloc(&rdev->resource.srq_table, 0,
nr_srqt, 0, 0);
if (err)
goto srq_err;
return 0;
srq_err:
c4iw_id_table_free(&rdev->resource.pdid_table);
pdid_err:
c4iw_id_table_free(&rdev->resource.qid_table);
qid_err:
@ -371,13 +382,21 @@ void c4iw_rqtpool_free(struct c4iw_rdev *rdev, u32 addr, int size)
int c4iw_rqtpool_create(struct c4iw_rdev *rdev)
{
unsigned rqt_start, rqt_chunk, rqt_top;
int skip = 0;
rdev->rqt_pool = gen_pool_create(MIN_RQT_SHIFT, -1);
if (!rdev->rqt_pool)
return -ENOMEM;
rqt_start = rdev->lldi.vr->rq.start;
rqt_chunk = rdev->lldi.vr->rq.size;
/*
* If SRQs are supported, then never use the first RQE from
* the RQT region. This is because HW uses RQT index 0 as NULL.
*/
if (rdev->lldi.vr->srq.size)
skip = T4_RQT_ENTRY_SIZE;
rqt_start = rdev->lldi.vr->rq.start + skip;
rqt_chunk = rdev->lldi.vr->rq.size - skip;
rqt_top = rqt_start + rqt_chunk;
while (rqt_start < rqt_top) {
@ -405,6 +424,32 @@ void c4iw_rqtpool_destroy(struct c4iw_rdev *rdev)
kref_put(&rdev->rqt_kref, destroy_rqtpool);
}
int c4iw_alloc_srq_idx(struct c4iw_rdev *rdev)
{
int idx;
idx = c4iw_id_alloc(&rdev->resource.srq_table);
mutex_lock(&rdev->stats.lock);
if (idx == -1) {
rdev->stats.srqt.fail++;
mutex_unlock(&rdev->stats.lock);
return -ENOMEM;
}
rdev->stats.srqt.cur++;
if (rdev->stats.srqt.cur > rdev->stats.srqt.max)
rdev->stats.srqt.max = rdev->stats.srqt.cur;
mutex_unlock(&rdev->stats.lock);
return idx;
}
void c4iw_free_srq_idx(struct c4iw_rdev *rdev, int idx)
{
c4iw_id_free(&rdev->resource.srq_table, idx);
mutex_lock(&rdev->stats.lock);
rdev->stats.srqt.cur--;
mutex_unlock(&rdev->stats.lock);
}
/*
* On-Chip QP Memory.
*/
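
Review note: the bookkeeping in c4iw_alloc_srq_idx() and c4iw_free_srq_idx() above (count a failure, otherwise bump `cur` and raise `max` if needed) can be tried outside the kernel with a small user-space sketch. Every name below is hypothetical and not part of the driver. The sketch assumes a 64-entry bitmap in place of `c4iw_id_table`, returns -1 where the driver returns -ENOMEM, and leaves out the `stats.lock` mutex.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct c4iw_stat; not the driver's type. */
struct stat_sketch {
	uint64_t total, cur, max, fail;
};

/* Hypothetical stand-in for the ID table: one bit per ID, at most 64 IDs. */
struct id_table_sketch {
	uint64_t used;   /* bit i set means ID i is allocated */
	unsigned int nr; /* number of valid IDs */
};

/* Return the lowest free ID, or -1 when the table is exhausted. */
static int id_alloc_sketch(struct id_table_sketch *t)
{
	for (unsigned int i = 0; i < t->nr; i++) {
		if (!(t->used & (UINT64_C(1) << i))) {
			t->used |= UINT64_C(1) << i;
			return (int)i;
		}
	}
	return -1;
}

static void id_free_sketch(struct id_table_sketch *t, int idx)
{
	/* Precondition check added for the sketch only. */
	assert(idx >= 0 && (unsigned int)idx < t->nr);
	t->used &= ~(UINT64_C(1) << idx);
}

/* Same shape as c4iw_alloc_srq_idx(): count failures, track cur and max. */
static int alloc_idx_sketch(struct id_table_sketch *t, struct stat_sketch *s)
{
	int idx = id_alloc_sketch(t);

	if (idx == -1) {
		s->fail++;
		return -1; /* the driver returns -ENOMEM here */
	}
	s->cur++;
	if (s->cur > s->max)
		s->max = s->cur;
	return idx;
}

/* Same shape as c4iw_free_srq_idx(): release the ID, then drop cur. */
static void free_idx_sketch(struct id_table_sketch *t, struct stat_sketch *s,
			    int idx)
{
	id_free_sketch(t, idx);
	s->cur--;
}
```

Note that `max` is a high-water mark: freeing an index lowers `cur` but never `max`, which is why stats_clear() resets `max` and `fail` but not `cur`.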


@ -52,12 +52,16 @@ struct t4_status_page {
__be16 pidx;
u8 qp_err; /* flit 1 - sw owns */
u8 db_off;
u8 pad;
u8 pad[2];
u16 host_wq_pidx;
u16 host_cidx;
u16 host_pidx;
u16 pad2;
u32 srqidx;
};
#define T4_RQT_ENTRY_SHIFT 6
#define T4_RQT_ENTRY_SIZE BIT(T4_RQT_ENTRY_SHIFT)
#define T4_EQ_ENTRY_SIZE 64
#define T4_SQ_NUM_SLOTS 5
@ -87,6 +91,9 @@ static inline int t4_max_fr_depth(int use_dsgl)
#define T4_RQ_NUM_BYTES (T4_EQ_ENTRY_SIZE * T4_RQ_NUM_SLOTS)
#define T4_MAX_RECV_SGE 4
#define T4_WRITE_CMPL_MAX_SGL 4
#define T4_WRITE_CMPL_MAX_CQE 16
union t4_wr {
struct fw_ri_res_wr res;
struct fw_ri_wr ri;
@ -97,6 +104,7 @@ union t4_wr {
struct fw_ri_fr_nsmr_wr fr;
struct fw_ri_fr_nsmr_tpte_wr fr_tpte;
struct fw_ri_inv_lstag_wr inv;
struct fw_ri_rdma_write_cmpl_wr write_cmpl;
struct t4_status_page status;
__be64 flits[T4_EQ_ENTRY_SIZE / sizeof(__be64) * T4_SQ_NUM_SLOTS];
};
@ -179,9 +187,32 @@ struct t4_cqe {
__be32 wrid_hi;
__be32 wrid_low;
} gen;
struct {
__be32 stag;
__be32 msn;
__be32 reserved;
__be32 abs_rqe_idx;
} srcqe;
struct {
__be32 mo;
__be32 msn;
/*
* Use union for immediate data to be consistent with
* stack's 32 bit data and iWARP spec's 64 bit data.
*/
union {
struct {
__be32 imm_data32;
u32 reserved;
} ib_imm_data;
__be64 imm_data64;
} iw_imm_data;
} imm_data_rcqe;
u64 drain_cookie;
__be64 flits[3];
} u;
__be64 reserved;
__be64 reserved[3];
__be64 bits_type_ts;
};
@ -237,6 +268,9 @@ struct t4_cqe {
/* used for RQ completion processing */
#define CQE_WRID_STAG(x) (be32_to_cpu((x)->u.rcqe.stag))
#define CQE_WRID_MSN(x) (be32_to_cpu((x)->u.rcqe.msn))
#define CQE_ABS_RQE_IDX(x) (be32_to_cpu((x)->u.srcqe.abs_rqe_idx))
#define CQE_IMM_DATA(x)( \
(x)->u.imm_data_rcqe.iw_imm_data.ib_imm_data.imm_data32)
/* used for SQ completion processing */
#define CQE_WRID_SQ_IDX(x) ((x)->u.scqe.cidx)
@ -320,6 +354,7 @@ struct t4_swrqe {
u64 wr_id;
ktime_t host_time;
u64 sge_ts;
int valid;
};
struct t4_rq {
@ -349,8 +384,98 @@ struct t4_wq {
void __iomem *db;
struct c4iw_rdev *rdev;
int flushed;
u8 *qp_errp;
u32 *srqidxp;
};
struct t4_srq_pending_wr {
u64 wr_id;
union t4_recv_wr wqe;
u8 len16;
};
struct t4_srq {
union t4_recv_wr *queue;
dma_addr_t dma_addr;
DECLARE_PCI_UNMAP_ADDR(mapping);
struct t4_swrqe *sw_rq;
void __iomem *bar2_va;
u64 bar2_pa;
size_t memsize;
u32 bar2_qid;
u32 qid;
u32 msn;
u32 rqt_hwaddr;
u32 rqt_abs_idx;
u16 rqt_size;
u16 size;
u16 cidx;
u16 pidx;
u16 wq_pidx;
u16 wq_pidx_inc;
u16 in_use;
struct t4_srq_pending_wr *pending_wrs;
u16 pending_cidx;
u16 pending_pidx;
u16 pending_in_use;
u16 ooo_count;
};
static inline u32 t4_srq_avail(struct t4_srq *srq)
{
return srq->size - 1 - srq->in_use;
}
static inline void t4_srq_produce(struct t4_srq *srq, u8 len16)
{
srq->in_use++;
if (++srq->pidx == srq->size)
srq->pidx = 0;
srq->wq_pidx += DIV_ROUND_UP(len16 * 16, T4_EQ_ENTRY_SIZE);
if (srq->wq_pidx >= srq->size * T4_RQ_NUM_SLOTS)
srq->wq_pidx %= srq->size * T4_RQ_NUM_SLOTS;
srq->queue[srq->size].status.host_pidx = srq->pidx;
}
static inline void t4_srq_produce_pending_wr(struct t4_srq *srq)
{
srq->pending_in_use++;
srq->in_use++;
if (++srq->pending_pidx == srq->size)
srq->pending_pidx = 0;
}
static inline void t4_srq_consume_pending_wr(struct t4_srq *srq)
{
srq->pending_in_use--;
srq->in_use--;
if (++srq->pending_cidx == srq->size)
srq->pending_cidx = 0;
}
static inline void t4_srq_produce_ooo(struct t4_srq *srq)
{
srq->in_use--;
srq->ooo_count++;
}
static inline void t4_srq_consume_ooo(struct t4_srq *srq)
{
srq->cidx++;
if (srq->cidx == srq->size)
srq->cidx = 0;
srq->queue[srq->size].status.host_cidx = srq->cidx;
srq->ooo_count--;
}
static inline void t4_srq_consume(struct t4_srq *srq)
{
srq->in_use--;
if (++srq->cidx == srq->size)
srq->cidx = 0;
srq->queue[srq->size].status.host_cidx = srq->cidx;
}
static inline int t4_rqes_posted(struct t4_wq *wq)
{
return wq->rq.in_use;
@ -384,7 +509,6 @@ static inline void t4_rq_produce(struct t4_wq *wq, u8 len16)
static inline void t4_rq_consume(struct t4_wq *wq)
{
wq->rq.in_use--;
wq->rq.msn++;
if (++wq->rq.cidx == wq->rq.size)
wq->rq.cidx = 0;
}
@ -464,6 +588,25 @@ static inline void pio_copy(u64 __iomem *dst, u64 *src)
}
}
static inline void t4_ring_srq_db(struct t4_srq *srq, u16 inc, u8 len16,
union t4_recv_wr *wqe)
{
/* Flush host queue memory writes. */
wmb();
if (inc == 1 && srq->bar2_qid == 0 && wqe) {
pr_debug("%s : WC srq->pidx = %d; len16=%d\n",
__func__, srq->pidx, len16);
pio_copy(srq->bar2_va + SGE_UDB_WCDOORBELL, (u64 *)wqe);
} else {
pr_debug("%s: DB srq->pidx = %d; len16=%d\n",
__func__, srq->pidx, len16);
writel(PIDX_T5_V(inc) | QID_V(srq->bar2_qid),
srq->bar2_va + SGE_UDB_KDOORBELL);
}
/* Flush user doorbell area writes. */
wmb();
}
static inline void t4_ring_sq_db(struct t4_wq *wq, u16 inc, union t4_wr *wqe)
{
@ -515,12 +658,14 @@ static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc,
static inline int t4_wq_in_error(struct t4_wq *wq)
{
return wq->rq.queue[wq->rq.size].status.qp_err;
return *wq->qp_errp;
}
static inline void t4_set_wq_in_error(struct t4_wq *wq)
static inline void t4_set_wq_in_error(struct t4_wq *wq, u32 srqidx)
{
wq->rq.queue[wq->rq.size].status.qp_err = 1;
if (srqidx)
*wq->srqidxp = srqidx;
*wq->qp_errp = 1;
}
static inline void t4_disable_wq_db(struct t4_wq *wq)
@ -565,6 +710,7 @@ struct t4_cq {
u16 cidx_inc;
u8 gen;
u8 error;
u8 *qp_errp;
unsigned long flags;
};
@ -698,18 +844,18 @@ static inline int t4_next_cqe(struct t4_cq *cq, struct t4_cqe **cqe)
static inline int t4_cq_in_error(struct t4_cq *cq)
{
return ((struct t4_status_page *)&cq->queue[cq->size])->qp_err;
return *cq->qp_errp;
}
static inline void t4_set_cq_in_error(struct t4_cq *cq)
{
((struct t4_status_page *)&cq->queue[cq->size])->qp_err = 1;
*cq->qp_errp = 1;
}
#endif
struct t4_dev_status_page {
u8 db_off;
u8 pad1;
u8 write_cmpl_supported;
u16 pad2;
u32 pad3;
u64 qp_start;
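
Review note: the index arithmetic in t4_srq_avail(), t4_srq_produce() and t4_srq_consume() above can be checked in isolation with the user-space sketch below. The struct and function names are hypothetical. `EQ_ENTRY_SIZE` matches `T4_EQ_ENTRY_SIZE` from this diff, while `RQ_NUM_SLOTS` is set to 2 as an assumption because the diff does not show the value of `T4_RQ_NUM_SLOTS`. The status-page writes (`host_pidx`, `host_cidx`) are omitted, and the asserts are additions for the sketch that the driver does not have.

```c
#include <assert.h>
#include <stdint.h>

#define EQ_ENTRY_SIZE 64 /* matches T4_EQ_ENTRY_SIZE in the diff */
#define RQ_NUM_SLOTS 2   /* assumed; the diff does not show T4_RQ_NUM_SLOTS */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical subset of struct t4_srq: only the ring indices. */
struct srq_sketch {
	uint16_t size, cidx, pidx, wq_pidx, in_use;
};

/* One entry is always kept free, so at most size - 1 WRs are outstanding. */
static uint32_t srq_avail(const struct srq_sketch *q)
{
	return q->size - 1 - q->in_use;
}

/* Post one receive WR whose length is len16 units of 16 bytes. */
static void srq_produce(struct srq_sketch *q, uint8_t len16)
{
	assert(srq_avail(q) > 0);
	q->in_use++;
	if (++q->pidx == q->size)
		q->pidx = 0;
	/* wq_pidx advances in 64-byte hardware slots, not in WRs. */
	q->wq_pidx += DIV_ROUND_UP(len16 * 16, EQ_ENTRY_SIZE);
	if (q->wq_pidx >= q->size * RQ_NUM_SLOTS)
		q->wq_pidx %= q->size * RQ_NUM_SLOTS;
}

/* Retire the oldest in-order WR. */
static void srq_consume(struct srq_sketch *q)
{
	assert(q->in_use > 0);
	q->in_use--;
	if (++q->cidx == q->size)
		q->cidx = 0;
}
```

The point of the two producer indices is that `pidx` counts work requests while `wq_pidx` counts hardware queue slots, so a WR longer than 64 bytes advances `wq_pidx` by more than one.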


@ -50,7 +50,8 @@ enum fw_ri_wr_opcode {
FW_RI_BYPASS = 0xd,
FW_RI_RECEIVE = 0xe,
FW_RI_SGE_EC_CR_RETURN = 0xf
FW_RI_SGE_EC_CR_RETURN = 0xf,
FW_RI_WRITE_IMMEDIATE = FW_RI_RDMA_INIT
};
enum fw_ri_wr_flags {
@ -59,7 +60,8 @@ enum fw_ri_wr_flags {
FW_RI_SOLICITED_EVENT_FLAG = 0x04,
FW_RI_READ_FENCE_FLAG = 0x08,
FW_RI_LOCAL_FENCE_FLAG = 0x10,
FW_RI_RDMA_READ_INVALIDATE = 0x20
FW_RI_RDMA_READ_INVALIDATE = 0x20,
FW_RI_RDMA_WRITE_WITH_IMMEDIATE = 0x40
};
enum fw_ri_mpa_attrs {
@ -263,6 +265,7 @@ enum fw_ri_res_type {
FW_RI_RES_TYPE_SQ,
FW_RI_RES_TYPE_RQ,
FW_RI_RES_TYPE_CQ,
FW_RI_RES_TYPE_SRQ,
};
enum fw_ri_res_op {
@ -296,6 +299,20 @@ struct fw_ri_res {
__be32 r6_lo;
__be64 r7;
} cq;
struct fw_ri_res_srq {
__u8 restype;
__u8 op;
__be16 r3;
__be32 eqid;
__be32 r4[2];
__be32 fetchszm_to_iqid;
__be32 dcaen_to_eqsize;
__be64 eqaddr;
__be32 srqid;
__be32 pdid;
__be32 hwsrqsize;
__be32 hwsrqaddr;
} srq;
} u;
};
@ -531,7 +548,17 @@ struct fw_ri_rdma_write_wr {
__u16 wrid;
__u8 r1[3];
__u8 len16;
__be64 r2;
/*
* Use union for immediate data to be consistent with stack's 32 bit
* data and iWARP spec's 64 bit data.
*/
union {
struct {
__be32 imm_data32;
u32 reserved;
} ib_imm_data;
__be64 imm_data64;
} iw_imm_data;
__be32 plen;
__be32 stag_sink;
__be64 to_sink;
@ -568,6 +595,37 @@ struct fw_ri_send_wr {
#define FW_RI_SEND_WR_SENDOP_G(x) \
(((x) >> FW_RI_SEND_WR_SENDOP_S) & FW_RI_SEND_WR_SENDOP_M)
struct fw_ri_rdma_write_cmpl_wr {
__u8 opcode;
__u8 flags;
__u16 wrid;
__u8 r1[3];
__u8 len16;
__u8 r2;
__u8 flags_send;
__u16 wrid_send;
__be32 stag_inv;
__be32 plen;
__be32 stag_sink;
__be64 to_sink;
union fw_ri_cmpl {
struct fw_ri_immd_cmpl {
__u8 op;
__u8 r1[6];
__u8 immdlen;
__u8 data[16];
} immd_src;
struct fw_ri_isgl isgl_src;
} u_cmpl;
__be64 r3;
#ifndef C99_NOT_SUPPORTED
union fw_ri_write {
struct fw_ri_immd immd_src[0];
struct fw_ri_isgl isgl_src[0];
} u;
#endif
};
struct fw_ri_rdma_read_wr {
__u8 opcode;
__u8 flags;
@ -707,6 +765,10 @@ enum fw_ri_init_p2ptype {
FW_RI_INIT_P2PTYPE_DISABLED = 0xf,
};
enum fw_ri_init_rqeqid_srq {
FW_RI_INIT_RQEQID_SRQ = 1 << 31,
};
struct fw_ri_wr {
__be32 op_compl;
__be32 flowid_len16;
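
Review note: the `iw_imm_data` union added to struct fw_ri_rdma_write_wr in this file overlays a 32-bit immediate plus a reserved word on a 64-bit field. The user-space sketch below illustrates that layout under stated assumptions: the type and helper names are hypothetical, `uint32_t`/`uint64_t` stand in for `__be32`/`__be64`, and `htonl()`/`ntohl()` from POSIX stand in for the kernel's byte-order helpers.

```c
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space mirror of the iw_imm_data union in the diff. */
union imm_data_sketch {
	struct {
		uint32_t imm_data32; /* stored big-endian */
		uint32_t reserved;
	} ib_imm_data;
	uint64_t imm_data64;         /* stored big-endian */
};

/* Store a host-order 32-bit immediate and zero the reserved half. */
static void set_imm32(union imm_data_sketch *d, uint32_t host_val)
{
	memset(d, 0, sizeof(*d));
	d->ib_imm_data.imm_data32 = htonl(host_val);
}

/* Read the 32-bit immediate back in host order. */
static uint32_t get_imm32(const union imm_data_sketch *d)
{
	return ntohl(d->ib_imm_data.imm_data32);
}
```

With this layout the 32-bit immediate occupies the first four bytes of the eight-byte field and the remaining four bytes stay zero, regardless of host endianness.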


@ -8143,8 +8143,15 @@ static void is_sdma_eng_int(struct hfi1_devdata *dd, unsigned int source)
}
}
/*
/**
* is_rcv_avail_int() - User receive context available IRQ handler
* @dd: valid dd
* @source: logical IRQ source (offset from IS_RCVAVAIL_START)
*
* RX block receive available interrupt. Source is < 160.
*
* This is the general interrupt handler for user (PSM) receive contexts,
* and can only be used for non-threaded IRQs.
*/
static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
{
@ -8154,12 +8161,7 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
if (likely(source < dd->num_rcv_contexts)) {
rcd = hfi1_rcd_get_by_index(dd, source);
if (rcd) {
/* Check for non-user contexts, including vnic */
if (source < dd->first_dyn_alloc_ctxt || rcd->is_vnic)
rcd->do_interrupt(rcd, 0);
else
handle_user_interrupt(rcd);
hfi1_rcd_put(rcd);
return; /* OK */
}
@ -8173,8 +8175,14 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
err_detail, source);
}
/*
/**
* is_rcv_urgent_int() - User receive context urgent IRQ handler
* @dd: valid dd
* @source: logical IRQ source (offset from IS_RCVURGENT_START)
*
* RX block receive urgent interrupt. Source is < 160.
*
* NOTE: kernel receive contexts specifically do NOT enable this IRQ.
*/
static void is_rcv_urgent_int(struct hfi1_devdata *dd, unsigned int source)
{
@ -8184,11 +8192,7 @@ static void is_rcv_urgent_int(struct hfi1_devdata *dd, unsigned int source)
if (likely(source < dd->num_rcv_contexts)) {
rcd = hfi1_rcd_get_by_index(dd, source);
if (rcd) {
/* only pay attention to user urgent interrupts */
if (source >= dd->first_dyn_alloc_ctxt &&
!rcd->is_vnic)
handle_user_interrupt(rcd);
hfi1_rcd_put(rcd);
return; /* OK */
}
@ -8260,9 +8264,14 @@ static void is_interrupt(struct hfi1_devdata *dd, unsigned int source)
dd_dev_err(dd, "invalid interrupt source %u\n", source);
}
/*
* General interrupt handler. This is able to correctly handle
* all interrupts in case INTx is used.
/**
* general_interrupt() - General interrupt handler
* @irq: MSIx IRQ vector
* @data: hfi1 devdata
*
* This is able to correctly handle all non-threaded interrupts. Receive
* context DATA IRQs are threaded and are not supported by this handler.
*
*/
static irqreturn_t general_interrupt(int irq, void *data)
{
@ -10130,7 +10139,7 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
(((lid & mask) & SEND_CTXT_CHECK_SLID_VALUE_MASK) <<
SEND_CTXT_CHECK_SLID_VALUE_SHIFT);
for (i = 0; i < dd->chip_send_contexts; i++) {
for (i = 0; i < chip_send_contexts(dd); i++) {
hfi1_cdbg(LINKVERB, "SendContext[%d].SLID_CHECK = 0x%x",
i, (u32)sreg);
write_kctxt_csr(dd, i, SEND_CTXT_CHECK_SLID, sreg);
@ -11857,7 +11866,7 @@ void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op,
* sequence numbers could land exactly on the same spot.
* E.g. a rcd restart before the receive header wrapped.
*/
memset(rcd->rcvhdrq, 0, rcd->rcvhdrq_size);
memset(rcd->rcvhdrq, 0, rcvhdrq_size(rcd));
/* starting timeout */
rcd->rcvavail_timeout = dd->rcv_intr_timeout_csr;
@ -11952,9 +11961,8 @@ void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op,
rcvctrl |= RCV_CTXT_CTRL_DONT_DROP_EGR_FULL_SMASK;
if (op & HFI1_RCVCTRL_NO_EGR_DROP_DIS)
rcvctrl &= ~RCV_CTXT_CTRL_DONT_DROP_EGR_FULL_SMASK;
rcd->rcvctrl = rcvctrl;
hfi1_cdbg(RCVCTRL, "ctxt %d rcvctrl 0x%llx\n", ctxt, rcvctrl);
write_kctxt_csr(dd, ctxt, RCV_CTXT_CTRL, rcd->rcvctrl);
write_kctxt_csr(dd, ctxt, RCV_CTXT_CTRL, rcvctrl);
/* work around sticky RcvCtxtStatus.BlockedRHQFull */
if (did_enable &&
@ -12042,7 +12050,7 @@ u32 hfi1_read_cntrs(struct hfi1_devdata *dd, char **namep, u64 **cntrp)
} else if (entry->flags & CNTR_SDMA) {
hfi1_cdbg(CNTR,
"\t Per SDMA Engine\n");
for (j = 0; j < dd->chip_sdma_engines;
for (j = 0; j < chip_sdma_engines(dd);
j++) {
val =
entry->rw_cntr(entry, dd, j,
@ -12418,6 +12426,7 @@ static int init_cntrs(struct hfi1_devdata *dd)
struct hfi1_pportdata *ppd;
const char *bit_type_32 = ",32";
const int bit_type_32_sz = strlen(bit_type_32);
u32 sdma_engines = chip_sdma_engines(dd);
/* set up the stats timer; the add_timer is done at the end */
timer_setup(&dd->synth_stats_timer, update_synth_timer, 0);
@ -12450,7 +12459,7 @@ static int init_cntrs(struct hfi1_devdata *dd)
}
} else if (dev_cntrs[i].flags & CNTR_SDMA) {
dev_cntrs[i].offset = dd->ndevcntrs;
for (j = 0; j < dd->chip_sdma_engines; j++) {
for (j = 0; j < sdma_engines; j++) {
snprintf(name, C_MAX_NAME, "%s%d",
dev_cntrs[i].name, j);
sz += strlen(name);
@ -12507,7 +12516,7 @@ static int init_cntrs(struct hfi1_devdata *dd)
*p++ = '\n';
}
} else if (dev_cntrs[i].flags & CNTR_SDMA) {
for (j = 0; j < dd->chip_sdma_engines; j++) {
for (j = 0; j < sdma_engines; j++) {
snprintf(name, C_MAX_NAME, "%s%d",
dev_cntrs[i].name, j);
memcpy(p, name, strlen(name));
@ -13020,9 +13029,9 @@ static void clear_all_interrupts(struct hfi1_devdata *dd)
write_csr(dd, SEND_PIO_ERR_CLEAR, ~(u64)0);
write_csr(dd, SEND_DMA_ERR_CLEAR, ~(u64)0);
write_csr(dd, SEND_EGRESS_ERR_CLEAR, ~(u64)0);
for (i = 0; i < dd->chip_send_contexts; i++)
for (i = 0; i < chip_send_contexts(dd); i++)
write_kctxt_csr(dd, i, SEND_CTXT_ERR_CLEAR, ~(u64)0);
for (i = 0; i < dd->chip_sdma_engines; i++)
for (i = 0; i < chip_sdma_engines(dd); i++)
write_kctxt_csr(dd, i, SEND_DMA_ENG_ERR_CLEAR, ~(u64)0);
write_csr(dd, DCC_ERR_FLG_CLR, ~(u64)0);
@ -13030,28 +13039,18 @@ static void clear_all_interrupts(struct hfi1_devdata *dd)
write_csr(dd, DC_DC8051_ERR_CLR, ~(u64)0);
}
/* Move to pcie.c? */
static void disable_intx(struct pci_dev *pdev)
{
pci_intx(pdev, 0);
}
/**
* hfi1_clean_up_interrupts() - Free all IRQ resources
* @dd: valid device data structure
*
* Free the MSI or INTx IRQs and associated PCI resources,
* if they have been allocated.
* Free the MSIx and associated PCI resources, if they have been allocated.
*/
void hfi1_clean_up_interrupts(struct hfi1_devdata *dd)
{
int i;
/* remove irqs - must happen before disabling/turning off */
if (dd->num_msix_entries) {
/* MSI-X */
struct hfi1_msix_entry *me = dd->msix_entries;
/* remove irqs - must happen before disabling/turning off */
for (i = 0; i < dd->num_msix_entries; i++, me++) {
if (!me->arg) /* => no irq, no affinity */
continue;
@ -13063,14 +13062,6 @@ void hfi1_clean_up_interrupts(struct hfi1_devdata *dd)
kfree(dd->msix_entries);
dd->msix_entries = NULL;
dd->num_msix_entries = 0;
} else {
/* INTx */
if (dd->requested_intx_irq) {
pci_free_irq(dd->pcidev, 0, dd);
dd->requested_intx_irq = 0;
}
disable_intx(dd->pcidev);
}
pci_free_irq_vectors(dd->pcidev);
}
@ -13121,20 +13112,6 @@ static void remap_sdma_interrupts(struct hfi1_devdata *dd,
msix_intr);
}
static int request_intx_irq(struct hfi1_devdata *dd)
{
int ret;
ret = pci_request_irq(dd->pcidev, 0, general_interrupt, NULL, dd,
DRIVER_NAME "_%d", dd->unit);
if (ret)
dd_dev_err(dd, "unable to request INTx interrupt, err %d\n",
ret);
else
dd->requested_intx_irq = 1;
return ret;
}
static int request_msix_irqs(struct hfi1_devdata *dd)
{
int first_general, last_general;
@ -13253,11 +13230,6 @@ void hfi1_vnic_synchronize_irq(struct hfi1_devdata *dd)
{
int i;
if (!dd->num_msix_entries) {
synchronize_irq(pci_irq_vector(dd->pcidev, 0));
return;
}
for (i = 0; i < dd->vnic.num_ctxt; i++) {
struct hfi1_ctxtdata *rcd = dd->vnic.ctxt[i];
struct hfi1_msix_entry *me = &dd->msix_entries[rcd->msix_intr];
@ -13346,7 +13318,6 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
{
u32 total;
int ret, request;
int single_interrupt = 0; /* we expect to have all the interrupts */
/*
* Interrupt count:
@ -13363,17 +13334,6 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
if (request < 0) {
ret = request;
goto fail;
} else if (request == 0) {
/* using INTx */
/* dd->num_msix_entries already zero */
single_interrupt = 1;
dd_dev_err(dd, "MSI-X failed, using INTx interrupts\n");
} else if (request < total) {
/* using MSI-X, with reduced interrupts */
dd_dev_err(dd, "reduced interrupt found, wanted %u, got %u\n",
total, request);
ret = -EINVAL;
goto fail;
} else {
dd->msix_entries = kcalloc(total, sizeof(*dd->msix_entries),
GFP_KERNEL);
@ -13394,9 +13354,6 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
/* reset general handler mask, chip MSI-X mappings */
reset_interrupts(dd);
if (single_interrupt)
ret = request_intx_irq(dd);
else
ret = request_msix_irqs(dd);
if (ret)
goto fail;
@ -13429,6 +13386,8 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
int qos_rmt_count;
int user_rmt_reduced;
u32 n_usr_ctxts;
u32 send_contexts = chip_send_contexts(dd);
u32 rcv_contexts = chip_rcv_contexts(dd);
/*
* Kernel receive contexts:
@ -13450,16 +13409,16 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
* Every kernel receive context needs an ACK send context.
* one send context is allocated for each VL{0-7} and VL15
*/
if (num_kernel_contexts > (dd->chip_send_contexts - num_vls - 1)) {
if (num_kernel_contexts > (send_contexts - num_vls - 1)) {
dd_dev_err(dd,
"Reducing # kernel rcv contexts to: %d, from %lu\n",
(int)(dd->chip_send_contexts - num_vls - 1),
send_contexts - num_vls - 1,
num_kernel_contexts);
num_kernel_contexts = dd->chip_send_contexts - num_vls - 1;
num_kernel_contexts = send_contexts - num_vls - 1;
}
/* Accommodate VNIC contexts if possible */
if ((num_kernel_contexts + num_vnic_contexts) > dd->chip_rcv_contexts) {
if ((num_kernel_contexts + num_vnic_contexts) > rcv_contexts) {
dd_dev_err(dd, "No receive contexts available for VNIC\n");
num_vnic_contexts = 0;
}
@ -13477,13 +13436,13 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
/*
* Adjust the counts given a global max.
*/
if (total_contexts + n_usr_ctxts > dd->chip_rcv_contexts) {
if (total_contexts + n_usr_ctxts > rcv_contexts) {
dd_dev_err(dd,
"Reducing # user receive contexts to: %d, from %u\n",
(int)(dd->chip_rcv_contexts - total_contexts),
rcv_contexts - total_contexts,
n_usr_ctxts);
/* recalculate */
n_usr_ctxts = dd->chip_rcv_contexts - total_contexts;
n_usr_ctxts = rcv_contexts - total_contexts;
}
/* each user context requires an entry in the RMT */
@ -13509,7 +13468,7 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
dd->freectxts = n_usr_ctxts;
dd_dev_info(dd,
"rcv contexts: chip %d, used %d (kernel %d, vnic %u, user %u)\n",
(int)dd->chip_rcv_contexts,
rcv_contexts,
(int)dd->num_rcv_contexts,
(int)dd->n_krcv_queues,
dd->num_vnic_contexts,
@ -13527,7 +13486,7 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
* contexts.
*/
dd->rcv_entries.group_size = RCV_INCREMENT;
ngroups = dd->chip_rcv_array_count / dd->rcv_entries.group_size;
ngroups = chip_rcv_array_count(dd) / dd->rcv_entries.group_size;
dd->rcv_entries.ngroups = ngroups / dd->num_rcv_contexts;
dd->rcv_entries.nctxt_extra = ngroups -
(dd->num_rcv_contexts * dd->rcv_entries.ngroups);
@ -13552,7 +13511,7 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
dd_dev_info(
dd,
"send contexts: chip %d, used %d (kernel %d, ack %d, user %d, vl15 %d)\n",
dd->chip_send_contexts,
send_contexts,
dd->num_send_contexts,
dd->sc_sizes[SC_KERNEL].count,
dd->sc_sizes[SC_ACK].count,
@ -13610,7 +13569,7 @@ static void write_uninitialized_csrs_and_memories(struct hfi1_devdata *dd)
write_csr(dd, CCE_INT_MAP + (8 * i), 0);
/* SendCtxtCreditReturnAddr */
for (i = 0; i < dd->chip_send_contexts; i++)
for (i = 0; i < chip_send_contexts(dd); i++)
write_kctxt_csr(dd, i, SEND_CTXT_CREDIT_RETURN_ADDR, 0);
/* PIO Send buffers */
@ -13623,7 +13582,7 @@ static void write_uninitialized_csrs_and_memories(struct hfi1_devdata *dd)
/* RcvHdrAddr */
/* RcvHdrTailAddr */
/* RcvTidFlowTable */
for (i = 0; i < dd->chip_rcv_contexts; i++) {
for (i = 0; i < chip_rcv_contexts(dd); i++) {
write_kctxt_csr(dd, i, RCV_HDR_ADDR, 0);
write_kctxt_csr(dd, i, RCV_HDR_TAIL_ADDR, 0);
for (j = 0; j < RXE_NUM_TID_FLOWS; j++)
@ -13631,7 +13590,7 @@ static void write_uninitialized_csrs_and_memories(struct hfi1_devdata *dd)
}
/* RcvArray */
for (i = 0; i < dd->chip_rcv_array_count; i++)
for (i = 0; i < chip_rcv_array_count(dd); i++)
hfi1_put_tid(dd, i, PT_INVALID_FLUSH, 0, 0);
/* RcvQPMapTable */
@ -13789,7 +13748,7 @@ static void reset_txe_csrs(struct hfi1_devdata *dd)
write_csr(dd, SEND_LOW_PRIORITY_LIST + (8 * i), 0);
for (i = 0; i < VL_ARB_HIGH_PRIO_TABLE_SIZE; i++)
write_csr(dd, SEND_HIGH_PRIORITY_LIST + (8 * i), 0);
for (i = 0; i < dd->chip_send_contexts / NUM_CONTEXTS_PER_SET; i++)
for (i = 0; i < chip_send_contexts(dd) / NUM_CONTEXTS_PER_SET; i++)
write_csr(dd, SEND_CONTEXT_SET_CTRL + (8 * i), 0);
for (i = 0; i < TXE_NUM_32_BIT_COUNTER; i++)
write_csr(dd, SEND_COUNTER_ARRAY32 + (8 * i), 0);
@ -13817,7 +13776,7 @@ static void reset_txe_csrs(struct hfi1_devdata *dd)
/*
* TXE Per-Context CSRs
*/
for (i = 0; i < dd->chip_send_contexts; i++) {
for (i = 0; i < chip_send_contexts(dd); i++) {
write_kctxt_csr(dd, i, SEND_CTXT_CTRL, 0);
write_kctxt_csr(dd, i, SEND_CTXT_CREDIT_CTRL, 0);
write_kctxt_csr(dd, i, SEND_CTXT_CREDIT_RETURN_ADDR, 0);
@ -13835,7 +13794,7 @@ static void reset_txe_csrs(struct hfi1_devdata *dd)
/*
* TXE Per-SDMA CSRs
*/
for (i = 0; i < dd->chip_sdma_engines; i++) {
for (i = 0; i < chip_sdma_engines(dd); i++) {
write_kctxt_csr(dd, i, SEND_DMA_CTRL, 0);
/* SEND_DMA_STATUS read-only */
write_kctxt_csr(dd, i, SEND_DMA_BASE_ADDR, 0);
@ -13968,7 +13927,7 @@ static void reset_rxe_csrs(struct hfi1_devdata *dd)
/*
* RXE Kernel and User Per-Context CSRs
*/
for (i = 0; i < dd->chip_rcv_contexts; i++) {
for (i = 0; i < chip_rcv_contexts(dd); i++) {
/* kernel */
write_kctxt_csr(dd, i, RCV_CTXT_CTRL, 0);
/* RCV_CTXT_STATUS read-only */
@ -14084,13 +14043,13 @@ static int init_chip(struct hfi1_devdata *dd)
/* disable send contexts and SDMA engines */
write_csr(dd, SEND_CTRL, 0);
for (i = 0; i < dd->chip_send_contexts; i++)
for (i = 0; i < chip_send_contexts(dd); i++)
write_kctxt_csr(dd, i, SEND_CTXT_CTRL, 0);
for (i = 0; i < dd->chip_sdma_engines; i++)
for (i = 0; i < chip_sdma_engines(dd); i++)
write_kctxt_csr(dd, i, SEND_DMA_CTRL, 0);
/* disable port (turn off RXE inbound traffic) and contexts */
write_csr(dd, RCV_CTRL, 0);
for (i = 0; i < dd->chip_rcv_contexts; i++)
for (i = 0; i < chip_rcv_contexts(dd); i++)
write_csr(dd, RCV_CTXT_CTRL, 0);
/* mask all interrupt sources */
for (i = 0; i < CCE_NUM_INT_CSRS; i++)
@ -14709,9 +14668,9 @@ static void init_txe(struct hfi1_devdata *dd)
write_csr(dd, SEND_EGRESS_ERR_MASK, ~0ull);
/* enable all per-context and per-SDMA engine errors */
for (i = 0; i < dd->chip_send_contexts; i++)
for (i = 0; i < chip_send_contexts(dd); i++)
write_kctxt_csr(dd, i, SEND_CTXT_ERR_MASK, ~0ull);
for (i = 0; i < dd->chip_sdma_engines; i++)
for (i = 0; i < chip_sdma_engines(dd); i++)
write_kctxt_csr(dd, i, SEND_DMA_ENG_ERR_MASK, ~0ull);
/* set the local CU to AU mapping */
@ -14979,11 +14938,13 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
"Functional simulator"
};
struct pci_dev *parent = pdev->bus->self;
u32 sdma_engines;
dd = hfi1_alloc_devdata(pdev, NUM_IB_PORTS *
sizeof(struct hfi1_pportdata));
if (IS_ERR(dd))
goto bail;
sdma_engines = chip_sdma_engines(dd);
ppd = dd->pport;
for (i = 0; i < dd->num_pports; i++, ppd++) {
int vl;
@ -15081,11 +15042,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
/* give a reasonable active value, will be set on link up */
dd->pport->link_speed_active = OPA_LINK_SPEED_25G;
dd->chip_rcv_contexts = read_csr(dd, RCV_CONTEXTS);
dd->chip_send_contexts = read_csr(dd, SEND_CONTEXTS);
dd->chip_sdma_engines = read_csr(dd, SEND_DMA_ENGINES);
dd->chip_pio_mem_size = read_csr(dd, SEND_PIO_MEM_SIZE);
dd->chip_sdma_mem_size = read_csr(dd, SEND_DMA_MEM_SIZE);
/* fix up link widths for emulation _p */
ppd = dd->pport;
if (dd->icode == ICODE_FPGA_EMULATION && is_emulator_p(dd)) {
@ -15096,11 +15052,11 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
OPA_LINK_WIDTH_1X;
}
/* insure num_vls isn't larger than number of sdma engines */
if (HFI1_CAP_IS_KSET(SDMA) && num_vls > dd->chip_sdma_engines) {
if (HFI1_CAP_IS_KSET(SDMA) && num_vls > sdma_engines) {
dd_dev_err(dd, "num_vls %u too large, using %u VLs\n",
num_vls, dd->chip_sdma_engines);
num_vls = dd->chip_sdma_engines;
ppd->vls_supported = dd->chip_sdma_engines;
num_vls, sdma_engines);
num_vls = sdma_engines;
ppd->vls_supported = sdma_engines;
ppd->vls_operational = ppd->vls_supported;
}
@ -15216,13 +15172,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
*/
aspm_init(dd);
dd->rcvhdrsize = DEFAULT_RCVHDRSIZE;
/*
* rcd[0] is guaranteed to be valid by this point. Also, all
* context are using the same value, as per the module parameter.
*/
dd->rhf_offset = dd->rcd[0]->rcvhdrqentsize - sizeof(u64) / sizeof(u32);
ret = init_pervl_scs(dd);
if (ret)
goto bail_cleanup;

@ -656,6 +656,36 @@ static inline void write_uctxt_csr(struct hfi1_devdata *dd, int ctxt,
write_csr(dd, offset0 + (0x1000 * ctxt), value);
}
static inline u32 chip_rcv_contexts(struct hfi1_devdata *dd)
{
return read_csr(dd, RCV_CONTEXTS);
}
static inline u32 chip_send_contexts(struct hfi1_devdata *dd)
{
return read_csr(dd, SEND_CONTEXTS);
}
static inline u32 chip_sdma_engines(struct hfi1_devdata *dd)
{
return read_csr(dd, SEND_DMA_ENGINES);
}
static inline u32 chip_pio_mem_size(struct hfi1_devdata *dd)
{
return read_csr(dd, SEND_PIO_MEM_SIZE);
}
static inline u32 chip_sdma_mem_size(struct hfi1_devdata *dd)
{
return read_csr(dd, SEND_DMA_MEM_SIZE);
}
static inline u32 chip_rcv_array_count(struct hfi1_devdata *dd)
{
return read_csr(dd, RCV_ARRAY_CNT);
}
u64 create_pbc(struct hfi1_pportdata *ppd, u64 flags, int srate_mbs, u32 vl,
u32 dw_len);
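The accessors above replace cached `dd->chip_*` fields with on-demand register reads, and the `hfi1_init_dd()` hunk reads `chip_sdma_engines(dd)` once into a local before using it several times. A minimal standalone sketch of that pattern follows. Everything prefixed `mock_` or `MOCK_` is hypothetical (a plain array stands in for the device registers and a counter records reads); it is not the driver's actual `read_csr()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices; the real driver uses its own CSR offsets. */
enum { MOCK_RCV_CONTEXTS = 0, MOCK_SEND_DMA_ENGINES = 1, MOCK_NUM_CSRS = 2 };

/* Stand-in for struct hfi1_devdata: mock registers plus a read counter. */
struct mock_devdata {
	uint64_t csr[MOCK_NUM_CSRS];
	unsigned int reads;	/* number of register reads performed */
};

/* Stand-in for read_csr(): every call counts as one device access. */
static uint64_t mock_read_csr(struct mock_devdata *dd, int offset)
{
	assert(offset >= 0 && offset < MOCK_NUM_CSRS);
	dd->reads++;
	return dd->csr[offset];
}

/* Accessor in the style of chip_sdma_engines(): no cached copy is kept. */
static inline uint32_t mock_chip_sdma_engines(struct mock_devdata *dd)
{
	return (uint32_t)mock_read_csr(dd, MOCK_SEND_DMA_ENGINES);
}

/*
 * Clamp num_vls to the engine count. The register value is read once
 * into a local, mirroring the sdma_engines local in hfi1_init_dd().
 */
static uint32_t clamp_num_vls(struct mock_devdata *dd, uint32_t num_vls)
{
	uint32_t sdma_engines = mock_chip_sdma_engines(dd);

	if (num_vls > sdma_engines)
		num_vls = sdma_engines;
	return num_vls;
}
```

The trade-off sketched here is a smaller device structure at the cost of one register read per call, which is why hot or repeated uses are better served by a local copy.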

@ -208,25 +208,25 @@ static inline void *get_egrbuf(const struct hfi1_ctxtdata *rcd, u64 rhf,
(offset * RCV_BUF_BLOCK_SIZE));
}
static inline void *hfi1_get_header(struct hfi1_devdata *dd,
static inline void *hfi1_get_header(struct hfi1_ctxtdata *rcd,
__le32 *rhf_addr)
{
u32 offset = rhf_hdrq_offset(rhf_to_cpu(rhf_addr));
return (void *)(rhf_addr - dd->rhf_offset + offset);
return (void *)(rhf_addr - rcd->rhf_offset + offset);
}
static inline struct ib_header *hfi1_get_msgheader(struct hfi1_devdata *dd,
static inline struct ib_header *hfi1_get_msgheader(struct hfi1_ctxtdata *rcd,
__le32 *rhf_addr)
{
return (struct ib_header *)hfi1_get_header(dd, rhf_addr);
return (struct ib_header *)hfi1_get_header(rcd, rhf_addr);
}
static inline struct hfi1_16b_header
*hfi1_get_16B_header(struct hfi1_devdata *dd,
*hfi1_get_16B_header(struct hfi1_ctxtdata *rcd,
__le32 *rhf_addr)
{
return (struct hfi1_16b_header *)hfi1_get_header(dd, rhf_addr);
return (struct hfi1_16b_header *)hfi1_get_header(rcd, rhf_addr);
}
/*
@ -591,13 +591,12 @@ static void __prescan_rxq(struct hfi1_packet *packet)
init_ps_mdata(&mdata, packet);
while (1) {
struct hfi1_devdata *dd = rcd->dd;
struct hfi1_ibport *ibp = rcd_to_iport(rcd);
__le32 *rhf_addr = (__le32 *)rcd->rcvhdrq + mdata.ps_head +
dd->rhf_offset;
packet->rcd->rhf_offset;
struct rvt_qp *qp;
struct ib_header *hdr;
struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
struct rvt_dev_info *rdi = &rcd->dd->verbs_dev.rdi;
u64 rhf = rhf_to_cpu(rhf_addr);
u32 etype = rhf_rcv_type(rhf), qpn, bth1;
int is_ecn = 0;
@ -612,7 +611,7 @@ static void __prescan_rxq(struct hfi1_packet *packet)
if (etype != RHF_RCV_TYPE_IB)
goto next;
packet->hdr = hfi1_get_msgheader(dd, rhf_addr);
packet->hdr = hfi1_get_msgheader(packet->rcd, rhf_addr);
hdr = packet->hdr;
lnh = ib_get_lnh(hdr);
@ -718,7 +717,7 @@ static noinline int skip_rcv_packet(struct hfi1_packet *packet, int thread)
ret = check_max_packet(packet, thread);
packet->rhf_addr = (__le32 *)packet->rcd->rcvhdrq + packet->rhqoff +
packet->rcd->dd->rhf_offset;
packet->rcd->rhf_offset;
packet->rhf = rhf_to_cpu(packet->rhf_addr);
return ret;
@ -757,7 +756,7 @@ static inline int process_rcv_packet(struct hfi1_packet *packet, int thread)
* crashing down. There is no need to eat another
* comparison in this performance critical code.
*/
packet->rcd->dd->rhf_rcv_function_map[packet->etype](packet);
packet->rcd->rhf_rcv_function_map[packet->etype](packet);
packet->numpkt++;
/* Set up for the next packet */
@ -768,7 +767,7 @@ static inline int process_rcv_packet(struct hfi1_packet *packet, int thread)
ret = check_max_packet(packet, thread);
packet->rhf_addr = (__le32 *)packet->rcd->rcvhdrq + packet->rhqoff +
packet->rcd->dd->rhf_offset;
packet->rcd->rhf_offset;
packet->rhf = rhf_to_cpu(packet->rhf_addr);
return ret;
@ -949,12 +948,12 @@ static inline int set_armed_to_active(struct hfi1_ctxtdata *rcd,
u8 sc = SC15_PACKET;
if (etype == RHF_RCV_TYPE_IB) {
struct ib_header *hdr = hfi1_get_msgheader(packet->rcd->dd,
struct ib_header *hdr = hfi1_get_msgheader(packet->rcd,
packet->rhf_addr);
sc = hfi1_9B_get_sc5(hdr, packet->rhf);
} else if (etype == RHF_RCV_TYPE_BYPASS) {
struct hfi1_16b_header *hdr = hfi1_get_16B_header(
packet->rcd->dd,
packet->rcd,
packet->rhf_addr);
sc = hfi1_16B_get_sc(hdr);
}
@ -1034,7 +1033,7 @@ int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread)
packet.rhqoff += packet.rsize;
packet.rhf_addr = (__le32 *)rcd->rcvhdrq +
packet.rhqoff +
dd->rhf_offset;
rcd->rhf_offset;
packet.rhf = rhf_to_cpu(packet.rhf_addr);
} else if (skip_pkt) {
@ -1384,7 +1383,7 @@ bail:
static inline void hfi1_setup_ib_header(struct hfi1_packet *packet)
{
packet->hdr = (struct hfi1_ib_message_header *)
hfi1_get_msgheader(packet->rcd->dd,
hfi1_get_msgheader(packet->rcd,
packet->rhf_addr);
packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
}
@ -1485,7 +1484,7 @@ static int hfi1_setup_bypass_packet(struct hfi1_packet *packet)
u8 l4;
packet->hdr = (struct hfi1_16b_header *)
hfi1_get_16B_header(packet->rcd->dd,
hfi1_get_16B_header(packet->rcd,
packet->rhf_addr);
l4 = hfi1_16B_get_l4(packet->hdr);
if (l4 == OPA_16B_L4_IB_LOCAL) {
@ -1575,7 +1574,7 @@ void handle_eflags(struct hfi1_packet *packet)
* The following functions are called by the interrupt handler. They are type
* specific handlers for each packet type.
*/
int process_receive_ib(struct hfi1_packet *packet)
static int process_receive_ib(struct hfi1_packet *packet)
{
if (hfi1_setup_9B_packet(packet))
return RHF_RCV_CONTINUE;
@ -1607,7 +1606,7 @@ static inline bool hfi1_is_vnic_packet(struct hfi1_packet *packet)
return false;
}
int process_receive_bypass(struct hfi1_packet *packet)
static int process_receive_bypass(struct hfi1_packet *packet)
{
struct hfi1_devdata *dd = packet->rcd->dd;
@ -1649,7 +1648,7 @@ int process_receive_bypass(struct hfi1_packet *packet)
return RHF_RCV_CONTINUE;
}
int process_receive_error(struct hfi1_packet *packet)
static int process_receive_error(struct hfi1_packet *packet)
{
/* KHdrHCRCErr -- KDETH packet with a bad HCRC */
if (unlikely(
@ -1668,7 +1667,7 @@ int process_receive_error(struct hfi1_packet *packet)
return RHF_RCV_CONTINUE;
}
int kdeth_process_expected(struct hfi1_packet *packet)
static int kdeth_process_expected(struct hfi1_packet *packet)
{
hfi1_setup_9B_packet(packet);
if (unlikely(hfi1_dbg_should_fault_rx(packet)))
@ -1682,7 +1681,7 @@ int kdeth_process_expected(struct hfi1_packet *packet)
return RHF_RCV_CONTINUE;
}
int kdeth_process_eager(struct hfi1_packet *packet)
static int kdeth_process_eager(struct hfi1_packet *packet)
{
hfi1_setup_9B_packet(packet);
if (unlikely(hfi1_dbg_should_fault_rx(packet)))
@ -1695,7 +1694,7 @@ int kdeth_process_eager(struct hfi1_packet *packet)
return RHF_RCV_CONTINUE;
}
int process_receive_invalid(struct hfi1_packet *packet)
static int process_receive_invalid(struct hfi1_packet *packet)
{
dd_dev_err(packet->rcd->dd, "Invalid packet type %d. Dropping\n",
rhf_rcv_type(packet->rhf));
@ -1719,9 +1718,8 @@ void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd)
init_ps_mdata(&mdata, &packet);
while (1) {
struct hfi1_devdata *dd = rcd->dd;
__le32 *rhf_addr = (__le32 *)rcd->rcvhdrq + mdata.ps_head +
dd->rhf_offset;
rcd->rhf_offset;
struct ib_header *hdr;
u64 rhf = rhf_to_cpu(rhf_addr);
u32 etype = rhf_rcv_type(rhf), qpn;
@ -1738,7 +1736,7 @@ void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd)
if (etype > RHF_RCV_TYPE_IB)
goto next;
packet.hdr = hfi1_get_msgheader(dd, rhf_addr);
packet.hdr = hfi1_get_msgheader(rcd, rhf_addr);
hdr = packet.hdr;
lnh = be16_to_cpu(hdr->lrh[0]) & 3;
@ -1760,3 +1758,14 @@ next:
update_ps_mdata(&mdata, rcd);
}
}
const rhf_rcv_function_ptr normal_rhf_rcv_functions[] = {
[RHF_RCV_TYPE_EXPECTED] = kdeth_process_expected,
[RHF_RCV_TYPE_EAGER] = kdeth_process_eager,
[RHF_RCV_TYPE_IB] = process_receive_ib,
[RHF_RCV_TYPE_ERROR] = process_receive_error,
[RHF_RCV_TYPE_BYPASS] = process_receive_bypass,
[RHF_RCV_TYPE_INVALID5] = process_receive_invalid,
[RHF_RCV_TYPE_INVALID6] = process_receive_invalid,
[RHF_RCV_TYPE_INVALID7] = process_receive_invalid,
};
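The table above turns the per-device array that `hfi1_init()` used to fill in at run time into a single `const` array built with designated initializers, which each receive context then points at. A reduced standalone sketch of that dispatch scheme is shown below. The type and handler names (`pkt_type`, `rcv_fn`, `handle_*`, `rcv_context`) are illustrative assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative packet types; the driver's RHF_RCV_TYPE_* set is larger. */
enum pkt_type { PKT_EXPECTED, PKT_EAGER, PKT_IB, PKT_ERROR, PKT_TYPE_COUNT };

struct packet {
	enum pkt_type etype;
	int handled_by;		/* records which handler ran */
};

typedef int (*rcv_fn)(struct packet *pkt);

static int handle_expected(struct packet *p) { p->handled_by = PKT_EXPECTED; return 0; }
static int handle_eager(struct packet *p)    { p->handled_by = PKT_EAGER;    return 0; }
static int handle_ib(struct packet *p)       { p->handled_by = PKT_IB;       return 0; }
static int handle_error(struct packet *p)    { p->handled_by = PKT_ERROR;    return 0; }

/* One shared, read-only table indexed by packet type. */
static const rcv_fn normal_rcv_functions[PKT_TYPE_COUNT] = {
	[PKT_EXPECTED] = handle_expected,
	[PKT_EAGER]    = handle_eager,
	[PKT_IB]       = handle_ib,
	[PKT_ERROR]    = handle_error,
};

/* Each context holds a pointer to a table, so tables can differ per context. */
struct rcv_context {
	const rcv_fn *rcv_function_map;
};

/* Dispatch by indexing the context's table with the packet type. */
static int dispatch(struct rcv_context *rcd, struct packet *pkt)
{
	assert((unsigned int)pkt->etype < (unsigned int)PKT_TYPE_COUNT);
	return rcd->rcv_function_map[pkt->etype](pkt);
}
```

Because the table is `const` and every slot is initialized, the handlers themselves can be `static`, which matches the prototype removals in the header hunk.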

@ -411,7 +411,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_area_struct *vma)
mapio = 1;
break;
case RCV_HDRQ:
memlen = uctxt->rcvhdrq_size;
memlen = rcvhdrq_size(uctxt);
memvirt = uctxt->rcvhdrq;
break;
case RCV_EGRBUF: {
@ -521,7 +521,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_area_struct *vma)
break;
case SUBCTXT_RCV_HDRQ:
memaddr = (u64)uctxt->subctxt_rcvhdr_base;
memlen = uctxt->rcvhdrq_size * uctxt->subctxt_cnt;
memlen = rcvhdrq_size(uctxt) * uctxt->subctxt_cnt;
flags |= VM_IO | VM_DONTEXPAND;
vmf = 1;
break;
@ -985,7 +985,11 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
* sub contexts.
* This has to be done here so the rest of the sub-contexts find the
* proper base context.
* NOTE: _set_bit() can be used here because the context creation is
* protected by the mutex (rather than the spin_lock), and will be the
* very first instance of this context.
*/
__set_bit(0, uctxt->in_use_ctxts);
if (uinfo->subctxt_cnt)
init_subctxts(uctxt, uinfo);
uctxt->userversion = uinfo->userversion;
@ -1040,7 +1044,7 @@ static int setup_subctxt(struct hfi1_ctxtdata *uctxt)
return -ENOMEM;
/* We can take the size of the RcvHdr Queue from the master */
uctxt->subctxt_rcvhdr_base = vmalloc_user(uctxt->rcvhdrq_size *
uctxt->subctxt_rcvhdr_base = vmalloc_user(rcvhdrq_size(uctxt) *
num_subctxts);
if (!uctxt->subctxt_rcvhdr_base) {
ret = -ENOMEM;

@ -169,12 +169,6 @@ extern const struct pci_error_handlers hfi1_pci_err_handler;
struct hfi1_opcode_stats_perctx;
struct ctxt_eager_bufs {
ssize_t size; /* total size of eager buffers */
u32 count; /* size of buffers array */
u32 numbufs; /* number of buffers allocated */
u32 alloced; /* number of rcvarray entries used */
u32 rcvtid_size; /* size of each eager rcv tid */
u32 threshold; /* head update threshold */
struct eager_buffer {
void *addr;
dma_addr_t dma;
@ -184,6 +178,12 @@ struct ctxt_eager_bufs {
void *addr;
dma_addr_t dma;
} *rcvtids;
u32 size; /* total size of eager buffers */
u32 rcvtid_size; /* size of each eager rcv tid */
u16 count; /* size of buffers array */
u16 numbufs; /* number of buffers allocated */
u16 alloced; /* number of rcvarray entries used */
u16 threshold; /* head update threshold */
};
struct exp_tid_set {
@ -191,43 +191,84 @@ struct exp_tid_set {
u32 count;
};
typedef int (*rhf_rcv_function_ptr)(struct hfi1_packet *packet);
struct hfi1_ctxtdata {
/* shadow the ctxt's RcvCtrl register */
u64 rcvctrl;
/* rcvhdrq base, needs mmap before useful */
void *rcvhdrq;
/* kernel virtual address where hdrqtail is updated */
volatile __le64 *rcvhdrtail_kvaddr;
/* when waiting for rcv or pioavail */
wait_queue_head_t wait;
/* rcvhdrq size (for freeing) */
size_t rcvhdrq_size;
/* so functions that need physical port can get it easily */
struct hfi1_pportdata *ppd;
/* so file ops can get at unit */
struct hfi1_devdata *dd;
/* this receive context's assigned PIO ACK send context */
struct send_context *sc;
/* per context recv functions */
const rhf_rcv_function_ptr *rhf_rcv_function_map;
/*
* The interrupt handler for a particular receive context can vary
* throughout it's lifetime. This is not a lock protected data member so
* it must be updated atomically and the prev and new value must always
* be valid. Worst case is we process an extra interrupt and up to 64
* packets with the wrong interrupt handler.
*/
int (*do_interrupt)(struct hfi1_ctxtdata *rcd, int threaded);
/* verbs rx_stats per rcd */
struct hfi1_opcode_stats_perctx *opstats;
/* clear interrupt mask */
u64 imask;
/* ctxt rcvhdrq head offset */
u32 head;
/* number of rcvhdrq entries */
u16 rcvhdrq_cnt;
u8 ireg; /* clear interrupt register */
/* receive packet sequence counter */
u8 seq_cnt;
/* size of each of the rcvhdrq entries */
u16 rcvhdrqentsize;
u8 rcvhdrqentsize;
/* offset of RHF within receive header entry */
u8 rhf_offset;
/* dynamic receive available interrupt timeout */
u8 rcvavail_timeout;
/* Indicates that this is vnic context */
bool is_vnic;
/* vnic queue index this context is mapped to */
u8 vnic_q_idx;
/* Is ASPM interrupt supported for this context */
bool aspm_intr_supported;
/* ASPM state (enabled/disabled) for this context */
bool aspm_enabled;
/* Is ASPM processing enabled for this context (in intr context) */
bool aspm_intr_enable;
struct ctxt_eager_bufs egrbufs;
/* QPs waiting for context processing */
struct list_head qp_wait_list;
/* tid allocation lists */
struct exp_tid_set tid_group_list;
struct exp_tid_set tid_used_list;
struct exp_tid_set tid_full_list;
/* Timer for re-enabling ASPM if interrupt activity quiets down */
struct timer_list aspm_timer;
/* per-context configuration flags */
unsigned long flags;
/* array of tid_groups */
struct tid_group *groups;
/* mmap of hdrq, must fit in 44 bits */
dma_addr_t rcvhdrq_dma;
dma_addr_t rcvhdrqtailaddr_dma;
struct ctxt_eager_bufs egrbufs;
/* this receive context's assigned PIO ACK send context */
struct send_context *sc;
/* dynamic receive available interrupt timeout */
u32 rcvavail_timeout;
/* Last interrupt timestamp */
ktime_t aspm_ts_last_intr;
/* Last timestamp at which we scheduled a timer for this context */
ktime_t aspm_ts_timer_sched;
/* Lock to serialize between intr, timer intr and user threads */
spinlock_t aspm_lock;
/* Reference count the base context usage */
struct kref kref;
/* Device context index */
u16 ctxt;
/*
* non-zero if ctxt can be shared, and defines the maximum number of
* sub-contexts for this device context.
*/
u16 subctxt_cnt;
/* non-zero if ctxt is being shared. */
u16 subctxt_id;
u8 uuid[16];
/* numa node of this context */
int numa_id;
/* associated msix interrupt. */
s16 msix_intr;
/* job key */
u16 jkey;
/* number of RcvArray groups for this context. */
@ -238,87 +279,59 @@ struct hfi1_ctxtdata {
u16 expected_count;
/* index of first expected TID entry. */
u16 expected_base;
/* array of tid_groups */
struct tid_group *groups;
/* Device context index */
u8 ctxt;
struct exp_tid_set tid_group_list;
struct exp_tid_set tid_used_list;
struct exp_tid_set tid_full_list;
/* lock protecting all Expected TID data of user contexts */
/* PSM Specific fields */
/* lock protecting all Expected TID data */
struct mutex exp_mutex;
/* per-context configuration flags */
unsigned long flags;
/* per-context event flags for fileops/intr communication */
unsigned long event_flags;
/* total number of polled urgent packets */
u32 urgent;
/* saved total number of polled urgent packets for poll edge trigger */
u32 urgent_poll;
/* when waiting for rcv or pioavail */
wait_queue_head_t wait;
/* uuid from PSM */
u8 uuid[16];
/* same size as task_struct .comm[], command that opened context */
char comm[TASK_COMM_LEN];
/* so file ops can get at unit */
struct hfi1_devdata *dd;
/* so functions that need physical port can get it easily */
struct hfi1_pportdata *ppd;
/* associated msix interrupt */
u32 msix_intr;
/* Bitmask of in use context(s) */
DECLARE_BITMAP(in_use_ctxts, HFI1_MAX_SHARED_CTXTS);
/* per-context event flags for fileops/intr communication */
unsigned long event_flags;
/* A page of memory for rcvhdrhead, rcvegrhead, rcvegrtail * N */
void *subctxt_uregbase;
/* An array of pages for the eager receive buffers * N */
void *subctxt_rcvegrbuf;
/* An array of pages for the eager header queue entries * N */
void *subctxt_rcvhdr_base;
/* Bitmask of in use context(s) */
DECLARE_BITMAP(in_use_ctxts, HFI1_MAX_SHARED_CTXTS);
/* The version of the library which opened this ctxt */
u32 userversion;
/* total number of polled urgent packets */
u32 urgent;
/* saved total number of polled urgent packets for poll edge trigger */
u32 urgent_poll;
/* Type of packets or conditions we want to poll for */
u16 poll_type;
/* receive packet sequence counter */
u8 seq_cnt;
/* ctxt rcvhdrq head offset */
u32 head;
/* QPs waiting for context processing */
struct list_head qp_wait_list;
/* interrupt handling */
u64 imask; /* clear interrupt mask */
int ireg; /* clear interrupt register */
int numa_id; /* numa node of this context */
/* verbs rx_stats per rcd */
struct hfi1_opcode_stats_perctx *opstats;
/* Is ASPM interrupt supported for this context */
bool aspm_intr_supported;
/* ASPM state (enabled/disabled) for this context */
bool aspm_enabled;
/* Timer for re-enabling ASPM if interrupt activity quietens down */
struct timer_list aspm_timer;
/* Lock to serialize between intr, timer intr and user threads */
spinlock_t aspm_lock;
/* Is ASPM processing enabled for this context (in intr context) */
bool aspm_intr_enable;
/* Last interrupt timestamp */
ktime_t aspm_ts_last_intr;
/* Last timestamp at which we scheduled a timer for this context */
ktime_t aspm_ts_timer_sched;
/* non-zero if ctxt is being shared. */
u16 subctxt_id;
/* The version of the library which opened this ctxt */
u32 userversion;
/*
* The interrupt handler for a particular receive context can vary
* throughout it's lifetime. This is not a lock protected data member so
* it must be updated atomically and the prev and new value must always
* be valid. Worst case is we process an extra interrupt and up to 64
* packets with the wrong interrupt handler.
* non-zero if ctxt can be shared, and defines the maximum number of
* sub-contexts for this device context.
*/
int (*do_interrupt)(struct hfi1_ctxtdata *rcd, int threaded);
u8 subctxt_cnt;
/* Indicates that this is vnic context */
bool is_vnic;
/* vnic queue index this context is mapped to */
u8 vnic_q_idx;
};
/**
* rcvhdrq_size - return total size in bytes for header queue
* @rcd: the receive context
*
* rcvhdrqentsize is in DWs, so we have to convert to bytes
*
*/
static inline u32 rcvhdrq_size(struct hfi1_ctxtdata *rcd)
{
return PAGE_ALIGN(rcd->rcvhdrq_cnt *
rcd->rcvhdrqentsize * sizeof(u32));
}
/*
* Represents a single packet at a high level. Put commonly computed things in
* here so we do not have to keep doing them over and over. The rule of thumb is
@ -897,12 +910,11 @@ struct hfi1_pportdata {
u64 vl_xmit_flit_cnt[C_VL_COUNT + 1];
};
typedef int (*rhf_rcv_function_ptr)(struct hfi1_packet *packet);
typedef void (*opcode_handler)(struct hfi1_packet *packet);
typedef void (*hfi1_make_req)(struct rvt_qp *qp,
struct hfi1_pkt_state *ps,
struct rvt_swqe *wqe);
extern const rhf_rcv_function_ptr normal_rhf_rcv_functions[];
/* return values for the RHF receive functions */
@ -1046,8 +1058,6 @@ struct hfi1_devdata {
dma_addr_t sdma_pad_phys;
/* for deallocation */
size_t sdma_heads_size;
/* number from the chip */
u32 chip_sdma_engines;
/* num used */
u32 num_sdma;
/* array of engines sized by num_sdma */
@ -1102,8 +1112,6 @@ struct hfi1_devdata {
/* base receive interrupt timeout, in CSR units */
u32 rcv_intr_timeout_csr;
u32 freezelen; /* max length of freezemsg */
u64 __iomem *egrtidbase;
spinlock_t sendctrl_lock; /* protect changes to SendCtrl */
spinlock_t rcvctrl_lock; /* protect changes to RcvCtrl */
spinlock_t uctxt_lock; /* protect rcd changes */
@ -1130,25 +1138,6 @@ struct hfi1_devdata {
/* Base GUID for device (network order) */
u64 base_guid;
/* these are the "32 bit" regs */
/* value we put in kr_rcvhdrsize */
u32 rcvhdrsize;
/* number of receive contexts the chip supports */
u32 chip_rcv_contexts;
/* number of receive array entries */
u32 chip_rcv_array_count;
/* number of PIO send contexts the chip supports */
u32 chip_send_contexts;
/* number of bytes in the PIO memory buffer */
u32 chip_pio_mem_size;
/* number of bytes in the SDMA memory buffer */
u32 chip_sdma_mem_size;
/* size of each rcvegrbuffer */
u32 rcvegrbufsize;
/* log2 of above */
u16 rcvegrbufsize_shift;
/* both sides of the PCIe link are gen3 capable */
u8 link_gen3_capable;
u8 dc_shutdown;
@ -1221,9 +1210,6 @@ struct hfi1_devdata {
u32 num_msix_entries;
u32 first_dyn_msix_idx;
/* INTx information */
u32 requested_intx_irq; /* did we request one? */
/* general interrupt: mask of handled interrupts */
u64 gi_mask[CCE_NUM_INT_CSRS];
@ -1289,8 +1275,6 @@ struct hfi1_devdata {
u64 sw_cce_err_status_aggregate;
/* Software counter that aggregates all bypass packet rcv errors */
u64 sw_rcv_bypass_packet_errors;
/* receive interrupt function */
rhf_rcv_function_ptr normal_rhf_rcv_functions[8];
/* Save the enabled LCB error bits */
u64 lcb_err_en;
@ -1329,10 +1313,7 @@ struct hfi1_devdata {
/* seqlock for sc2vl */
seqlock_t sc2vl_lock ____cacheline_aligned_in_smp;
u64 sc2vl[4];
/* receive interrupt functions */
rhf_rcv_function_ptr *rhf_rcv_function_map;
u64 __percpu *rcv_limit;
u16 rhf_offset; /* offset of RHF within receive header entry */
/* adding a new field here would make it part of this cacheline */
/* OUI comes from the HW. Used everywhere as 3 separate bytes. */
@ -1471,7 +1452,7 @@ void hfi1_make_ud_req_16B(struct rvt_qp *qp,
/* calculate the current RHF address */
static inline __le32 *get_rhf_addr(struct hfi1_ctxtdata *rcd)
{
return (__le32 *)rcd->rcvhdrq + rcd->head + rcd->dd->rhf_offset;
return (__le32 *)rcd->rcvhdrq + rcd->head + rcd->rhf_offset;
}
int hfi1_reset_device(int);
@ -2021,12 +2002,6 @@ static inline void flush_wc(void)
}
void handle_eflags(struct hfi1_packet *packet);
int process_receive_ib(struct hfi1_packet *packet);
int process_receive_bypass(struct hfi1_packet *packet);
int process_receive_error(struct hfi1_packet *packet);
int kdeth_process_expected(struct hfi1_packet *packet);
int kdeth_process_eager(struct hfi1_packet *packet);
int process_receive_invalid(struct hfi1_packet *packet);
void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd);
/* global module parameter variables */

@ -364,9 +364,9 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
hfi1_exp_tid_group_init(rcd);
rcd->ppd = ppd;
rcd->dd = dd;
__set_bit(0, rcd->in_use_ctxts);
rcd->numa_id = numa;
rcd->rcv_array_groups = dd->rcv_entries.ngroups;
rcd->rhf_rcv_function_map = normal_rhf_rcv_functions;
mutex_init(&rcd->exp_mutex);
@ -404,6 +404,8 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
rcd->rcvhdrq_cnt = rcvhdrcnt;
rcd->rcvhdrqentsize = hfi1_hdrq_entsize;
rcd->rhf_offset =
rcd->rcvhdrqentsize - sizeof(u64) / sizeof(u32);
/*
* Simple Eager buffer allocation: we have already pre-allocated
* the number of RcvArray entry groups. Each ctxtdata structure
@ -853,24 +855,6 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
struct hfi1_ctxtdata *rcd;
struct hfi1_pportdata *ppd;
/* Set up recv low level handlers */
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_EXPECTED] =
kdeth_process_expected;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_EAGER] =
kdeth_process_eager;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_IB] = process_receive_ib;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_ERROR] =
process_receive_error;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_BYPASS] =
process_receive_bypass;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_INVALID5] =
process_receive_invalid;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_INVALID6] =
process_receive_invalid;
dd->normal_rhf_rcv_functions[RHF_RCV_TYPE_INVALID7] =
process_receive_invalid;
dd->rhf_rcv_function_map = dd->normal_rhf_rcv_functions;
/* Set up send low level handlers */
dd->process_pio_send = hfi1_verbs_send_pio;
dd->process_dma_send = hfi1_verbs_send_dma;
@ -936,7 +920,7 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
}
/* Allocate enough memory for user event notification. */
len = PAGE_ALIGN(dd->chip_rcv_contexts * HFI1_MAX_SHARED_CTXTS *
len = PAGE_ALIGN(chip_rcv_contexts(dd) * HFI1_MAX_SHARED_CTXTS *
sizeof(*dd->events));
dd->events = vmalloc_user(len);
if (!dd->events)
@ -948,9 +932,6 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
dd->status = vmalloc_user(PAGE_SIZE);
if (!dd->status)
dd_dev_err(dd, "Failed to allocate dev status page\n");
else
dd->freezelen = PAGE_SIZE - (sizeof(*dd->status) -
sizeof(dd->status->freezemsg));
for (pidx = 0; pidx < dd->num_pports; ++pidx) {
ppd = dd->pport + pidx;
if (dd->status)
@ -1144,7 +1125,7 @@ void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
return;
if (rcd->rcvhdrq) {
dma_free_coherent(&dd->pcidev->dev, rcd->rcvhdrq_size,
dma_free_coherent(&dd->pcidev->dev, rcvhdrq_size(rcd),
rcd->rcvhdrq, rcd->rcvhdrq_dma);
rcd->rcvhdrq = NULL;
if (rcd->rcvhdrtail_kvaddr) {
@ -1855,12 +1836,7 @@ int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
if (!rcd->rcvhdrq) {
gfp_t gfp_flags;
/*
* rcvhdrqentsize is in DWs, so we have to convert to bytes
* (* sizeof(u32)).
*/
amt = PAGE_ALIGN(rcd->rcvhdrq_cnt * rcd->rcvhdrqentsize *
sizeof(u32));
amt = rcvhdrq_size(rcd);
if (rcd->ctxt < dd->first_dyn_alloc_ctxt || rcd->is_vnic)
gfp_flags = GFP_KERNEL;
@ -1885,8 +1861,6 @@ int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
if (!rcd->rcvhdrtail_kvaddr)
goto bail_free;
}
rcd->rcvhdrq_size = amt;
}
/*
* These values are per-context:
@ -1902,7 +1876,7 @@ int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
& RCV_HDR_ENT_SIZE_ENT_SIZE_MASK)
<< RCV_HDR_ENT_SIZE_ENT_SIZE_SHIFT;
write_kctxt_csr(dd, rcd->ctxt, RCV_HDR_ENT_SIZE, reg);
reg = (dd->rcvhdrsize & RCV_HDR_SIZE_HDR_SIZE_MASK)
reg = ((u64)DEFAULT_RCVHDRSIZE & RCV_HDR_SIZE_HDR_SIZE_MASK)
<< RCV_HDR_SIZE_HDR_SIZE_SHIFT;
write_kctxt_csr(dd, rcd->ctxt, RCV_HDR_SIZE, reg);
@ -1938,9 +1912,9 @@ bail:
int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd)
{
struct hfi1_devdata *dd = rcd->dd;
u32 max_entries, egrtop, alloced_bytes = 0, idx = 0;
u32 max_entries, egrtop, alloced_bytes = 0;
gfp_t gfp_flags;
u16 order;
u16 order, idx = 0;
int ret = 0;
u16 round_mtu = roundup_pow_of_two(hfi1_max_mtu);

@ -157,6 +157,7 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
unsigned long len;
resource_size_t addr;
int ret = 0;
u32 rcv_array_count;
addr = pci_resource_start(pdev, 0);
len = pci_resource_len(pdev, 0);
@ -186,9 +187,9 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
goto nomem;
}
dd->chip_rcv_array_count = readq(dd->kregbase1 + RCV_ARRAY_CNT);
dd_dev_info(dd, "RcvArray count: %u\n", dd->chip_rcv_array_count);
dd->base2_start = RCV_ARRAY + dd->chip_rcv_array_count * 8;
rcv_array_count = readq(dd->kregbase1 + RCV_ARRAY_CNT);
dd_dev_info(dd, "RcvArray count: %u\n", rcv_array_count);
dd->base2_start = RCV_ARRAY + rcv_array_count * 8;
dd->kregbase2 = ioremap_nocache(
addr + dd->base2_start,
@ -214,13 +215,13 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
* to write an entire cacheline worth of entries in one shot.
*/
dd->rcvarray_wc = ioremap_wc(addr + RCV_ARRAY,
dd->chip_rcv_array_count * 8);
rcv_array_count * 8);
if (!dd->rcvarray_wc) {
dd_dev_err(dd, "WC mapping of receive array failed\n");
goto nomem;
}
dd_dev_info(dd, "WC RcvArray: %p for %x\n",
dd->rcvarray_wc, dd->chip_rcv_array_count * 8);
dd->rcvarray_wc, rcv_array_count * 8);
dd->flags |= HFI1_PRESENT; /* chip.c CSR routines now work */
return 0;
@ -346,15 +347,13 @@ int pcie_speeds(struct hfi1_devdata *dd)
/*
* Returns:
* - actual number of interrupts allocated or
* - 0 if fell back to INTx.
* - error
*/
int request_msix(struct hfi1_devdata *dd, u32 msireq)
{
int nvec;
nvec = pci_alloc_irq_vectors(dd->pcidev, 1, msireq,
PCI_IRQ_MSIX | PCI_IRQ_LEGACY);
nvec = pci_alloc_irq_vectors(dd->pcidev, msireq, msireq, PCI_IRQ_MSIX);
if (nvec < 0) {
dd_dev_err(dd, "pci_alloc_irq_vectors() failed: %d\n", nvec);
return nvec;
@ -362,10 +361,6 @@ int request_msix(struct hfi1_devdata *dd, u32 msireq)
tune_pcie_caps(dd);
/* check for legacy IRQ */
if (nvec == 1 && !dd->pcidev->msix_enabled)
return 0;
return nvec;
}

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015-2017 Intel Corporation.
* Copyright(c) 2015-2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -226,7 +226,7 @@ static const char *sc_type_name(int index)
int init_sc_pools_and_sizes(struct hfi1_devdata *dd)
{
struct mem_pool_info mem_pool_info[NUM_SC_POOLS] = { { 0 } };
int total_blocks = (dd->chip_pio_mem_size / PIO_BLOCK_SIZE) - 1;
int total_blocks = (chip_pio_mem_size(dd) / PIO_BLOCK_SIZE) - 1;
int total_contexts = 0;
int fixed_blocks;
int pool_blocks;
@ -343,8 +343,8 @@ int init_sc_pools_and_sizes(struct hfi1_devdata *dd)
sc_type_name(i), count);
return -EINVAL;
}
if (total_contexts + count > dd->chip_send_contexts)
count = dd->chip_send_contexts - total_contexts;
if (total_contexts + count > chip_send_contexts(dd))
count = chip_send_contexts(dd) - total_contexts;
total_contexts += count;
@ -507,7 +507,7 @@ static int sc_hw_alloc(struct hfi1_devdata *dd, int type, u32 *sw_index,
if (sci->type == type && sci->allocated == 0) {
sci->allocated = 1;
/* use a 1:1 mapping, but make them non-equal */
context = dd->chip_send_contexts - index - 1;
context = chip_send_contexts(dd) - index - 1;
dd->hw_to_sw[context] = index;
*sw_index = index;
*hw_context = context;
@ -1618,11 +1618,11 @@ static void sc_piobufavail(struct send_context *sc)
/* Wake up the most starved one first */
if (n)
hfi1_qp_wakeup(qps[max_idx],
RVT_S_WAIT_PIO | RVT_S_WAIT_PIO_DRAIN);
RVT_S_WAIT_PIO | HFI1_S_WAIT_PIO_DRAIN);
for (i = 0; i < n; i++)
if (i != max_idx)
hfi1_qp_wakeup(qps[i],
RVT_S_WAIT_PIO | RVT_S_WAIT_PIO_DRAIN);
RVT_S_WAIT_PIO | HFI1_S_WAIT_PIO_DRAIN);
}
/* translate a send credit update to a bit code of reasons */

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015 - 2017 Intel Corporation.
* Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -273,7 +273,7 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
if (attr_mask & IB_QP_PATH_MIG_STATE &&
attr->path_mig_state == IB_MIG_MIGRATED &&
qp->s_mig_state == IB_MIG_ARMED) {
qp->s_flags |= RVT_S_AHG_CLEAR;
qp->s_flags |= HFI1_S_AHG_CLEAR;
priv->s_sc = ah_to_sc(ibqp->device, &qp->remote_ah_attr);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
@ -717,7 +717,7 @@ void hfi1_migrate_qp(struct rvt_qp *qp)
qp->remote_ah_attr = qp->alt_ah_attr;
qp->port_num = rdma_ah_get_port_num(&qp->alt_ah_attr);
qp->s_pkey_index = qp->s_alt_pkey_index;
qp->s_flags |= RVT_S_AHG_CLEAR;
qp->s_flags |= HFI1_S_AHG_CLEAR;
priv->s_sc = ah_to_sc(qp->ibqp.device, &qp->remote_ah_attr);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
qp_set_16b(qp);

@ -1,7 +1,7 @@
#ifndef _QP_H
#define _QP_H
/*
* Copyright(c) 2015 - 2017 Intel Corporation.
* Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -69,6 +69,26 @@ static inline int hfi1_send_ok(struct rvt_qp *qp)
!(qp->s_flags & RVT_S_ANY_WAIT_SEND));
}
/*
* Driver specific s_flags starting at bit 31 down to HFI1_S_MIN_BIT_MASK
*
* HFI1_S_AHG_VALID - ahg header valid on chip
* HFI1_S_AHG_CLEAR - have send engine clear ahg state
* HFI1_S_WAIT_PIO_DRAIN - qp waiting for PIOs to drain
* HFI1_S_MIN_BIT_MASK - the lowest bit that can be used by hfi1
*/
#define HFI1_S_AHG_VALID 0x80000000
#define HFI1_S_AHG_CLEAR 0x40000000
#define HFI1_S_WAIT_PIO_DRAIN 0x20000000
#define HFI1_S_MIN_BIT_MASK 0x01000000
/*
* overload wait defines
*/
#define HFI1_S_ANY_WAIT_IO (RVT_S_ANY_WAIT_IO | HFI1_S_WAIT_PIO_DRAIN)
#define HFI1_S_ANY_WAIT (HFI1_S_ANY_WAIT_IO | RVT_S_ANY_WAIT_SEND)
/*
* free_ahg - clear ahg from QP
*/
@ -77,7 +97,7 @@ static inline void clear_ahg(struct rvt_qp *qp)
struct hfi1_qp_priv *priv = qp->priv;
priv->s_ahg->ahgcount = 0;
qp->s_flags &= ~(RVT_S_AHG_VALID | RVT_S_AHG_CLEAR);
qp->s_flags &= ~(HFI1_S_AHG_VALID | HFI1_S_AHG_CLEAR);
if (priv->s_sde && qp->s_ahgidx >= 0)
sdma_ahg_free(priv->s_sde, qp->s_ahgidx);
qp->s_ahgidx = -1;
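
The driver-private flags introduced in this hunk sit at the top of the 32-bit s_flags word, counting down from bit 31, with HFI1_S_MIN_BIT_MASK marking the lowest bit hfi1 may claim. The following is a minimal user-space sketch of that layout, not driver code: the HFI1_* values are copied from the hunk, while EXAMPLE_CORE_FLAGS and the helper functions are hypothetical names made up for illustration (the real RVT_S_* values are not shown in this diff).

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the hunk above. */
#define HFI1_S_AHG_VALID      0x80000000u
#define HFI1_S_AHG_CLEAR      0x40000000u
#define HFI1_S_WAIT_PIO_DRAIN 0x20000000u
#define HFI1_S_MIN_BIT_MASK   0x01000000u

/* Hypothetical: assume the core owns every bit below HFI1_S_MIN_BIT_MASK. */
#define EXAMPLE_CORE_FLAGS (HFI1_S_MIN_BIT_MASK - 1u)

/* All bits from HFI1_S_MIN_BIT_MASK up to bit 31. */
static inline uint32_t hfi1_driver_bit_range(void)
{
	return ~(HFI1_S_MIN_BIT_MASK - 1u);
}

/* Union of the driver-specific flags defined so far. */
static inline uint32_t hfi1_driver_flags(void)
{
	return HFI1_S_AHG_VALID | HFI1_S_AHG_CLEAR | HFI1_S_WAIT_PIO_DRAIN;
}

/* Returns 1 when every driver flag lies in the driver range and
 * none of them collides with the (assumed) core bits. */
static inline int hfi1_flags_layout_ok(void)
{
	uint32_t drv = hfi1_driver_flags();

	return (drv & ~hfi1_driver_bit_range()) == 0 &&
	       (drv & EXAMPLE_CORE_FLAGS) == 0;
}

/* Same masking as clear_ahg(): drop both AHG bits, keep the rest. */
static inline uint32_t clear_ahg_bits(uint32_t s_flags)
{
	return s_flags & ~(HFI1_S_AHG_VALID | HFI1_S_AHG_CLEAR);
}
```

A check like hfi1_flags_layout_ok() could equally be expressed as a compile-time assertion; the runtime form is used here only to keep the sketch easy to exercise.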

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015, 2016 Intel Corporation.
* Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -241,7 +241,7 @@ bail:
smp_wmb();
qp->s_flags &= ~(RVT_S_RESP_PENDING
| RVT_S_ACK_PENDING
| RVT_S_AHG_VALID);
| HFI1_S_AHG_VALID);
return 0;
}
@ -1024,7 +1024,7 @@ done:
if ((cmp_psn(qp->s_psn, qp->s_sending_hpsn) <= 0) &&
(cmp_psn(qp->s_sending_psn, qp->s_sending_hpsn) <= 0))
qp->s_flags |= RVT_S_WAIT_PSN;
qp->s_flags &= ~RVT_S_AHG_VALID;
qp->s_flags &= ~HFI1_S_AHG_VALID;
}
/*

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015 - 2017 Intel Corporation.
* Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -194,7 +194,7 @@ static void ruc_loopback(struct rvt_qp *sqp)
spin_lock_irqsave(&sqp->s_lock, flags);
/* Return if we are already busy processing a work request. */
if ((sqp->s_flags & (RVT_S_BUSY | RVT_S_ANY_WAIT)) ||
if ((sqp->s_flags & (RVT_S_BUSY | HFI1_S_ANY_WAIT)) ||
!(ib_rvt_state_ops[sqp->state] & RVT_PROCESS_OR_FLUSH_SEND))
goto unlock;
@ -533,9 +533,9 @@ static inline void build_ahg(struct rvt_qp *qp, u32 npsn)
{
struct hfi1_qp_priv *priv = qp->priv;
if (unlikely(qp->s_flags & RVT_S_AHG_CLEAR))
if (unlikely(qp->s_flags & HFI1_S_AHG_CLEAR))
clear_ahg(qp);
if (!(qp->s_flags & RVT_S_AHG_VALID)) {
if (!(qp->s_flags & HFI1_S_AHG_VALID)) {
/* first middle that needs copy */
if (qp->s_ahgidx < 0)
qp->s_ahgidx = sdma_ahg_alloc(priv->s_sde);
@ -544,7 +544,7 @@ static inline void build_ahg(struct rvt_qp *qp, u32 npsn)
priv->s_ahg->tx_flags |= SDMA_TXREQ_F_AHG_COPY;
/* save to protect a change in another thread */
priv->s_ahg->ahgidx = qp->s_ahgidx;
qp->s_flags |= RVT_S_AHG_VALID;
qp->s_flags |= HFI1_S_AHG_VALID;
}
} else {
/* subsequent middle after valid */
@ -650,7 +650,7 @@ static inline void hfi1_make_ruc_header_16B(struct rvt_qp *qp,
if (middle)
build_ahg(qp, bth2);
else
qp->s_flags &= ~RVT_S_AHG_VALID;
qp->s_flags &= ~HFI1_S_AHG_VALID;
bth0 |= pkey;
bth0 |= extra_bytes << 20;
@ -727,7 +727,7 @@ static inline void hfi1_make_ruc_header_9B(struct rvt_qp *qp,
if (middle)
build_ahg(qp, bth2);
else
qp->s_flags &= ~RVT_S_AHG_VALID;
qp->s_flags &= ~HFI1_S_AHG_VALID;
bth0 |= pkey;
bth0 |= extra_bytes << 20;

@ -1351,7 +1351,7 @@ int sdma_init(struct hfi1_devdata *dd, u8 port)
struct hfi1_pportdata *ppd = dd->pport + port;
u32 per_sdma_credits;
uint idle_cnt = sdma_idle_cnt;
size_t num_engines = dd->chip_sdma_engines;
size_t num_engines = chip_sdma_engines(dd);
int ret = -ENOMEM;
if (!HFI1_CAP_IS_KSET(SDMA)) {
@ -1360,18 +1360,18 @@ int sdma_init(struct hfi1_devdata *dd, u8 port)
}
if (mod_num_sdma &&
/* can't exceed chip support */
mod_num_sdma <= dd->chip_sdma_engines &&
mod_num_sdma <= chip_sdma_engines(dd) &&
/* count must be >= vls */
mod_num_sdma >= num_vls)
num_engines = mod_num_sdma;
dd_dev_info(dd, "SDMA mod_num_sdma: %u\n", mod_num_sdma);
dd_dev_info(dd, "SDMA chip_sdma_engines: %u\n", dd->chip_sdma_engines);
dd_dev_info(dd, "SDMA chip_sdma_engines: %u\n", chip_sdma_engines(dd));
dd_dev_info(dd, "SDMA chip_sdma_mem_size: %u\n",
dd->chip_sdma_mem_size);
chip_sdma_mem_size(dd));
per_sdma_credits =
dd->chip_sdma_mem_size / (num_engines * SDMA_BLOCK_SIZE);
chip_sdma_mem_size(dd) / (num_engines * SDMA_BLOCK_SIZE);
/* set up freeze waitqueue */
init_waitqueue_head(&dd->sdma_unfreeze_wq);

@ -1007,7 +1007,7 @@ static int pio_wait(struct rvt_qp *qp,
int was_empty;
dev->n_piowait += !!(flag & RVT_S_WAIT_PIO);
dev->n_piodrain += !!(flag & RVT_S_WAIT_PIO_DRAIN);
dev->n_piodrain += !!(flag & HFI1_S_WAIT_PIO_DRAIN);
qp->s_flags |= flag;
was_empty = list_empty(&sc->piowait);
iowait_queue(ps->pkts_sent, &priv->s_iowait,
@ -1376,7 +1376,7 @@ int hfi1_verbs_send(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
return pio_wait(qp,
ps->s_txreq->psc,
ps,
RVT_S_WAIT_PIO_DRAIN);
HFI1_S_WAIT_PIO_DRAIN);
return sr(qp, ps, 0);
}
@ -1410,7 +1410,8 @@ static void hfi1_fill_device_attr(struct hfi1_devdata *dd)
rdi->dparms.props.max_fast_reg_page_list_len = UINT_MAX;
rdi->dparms.props.max_qp = hfi1_max_qps;
rdi->dparms.props.max_qp_wr = hfi1_max_qp_wrs;
rdi->dparms.props.max_sge = hfi1_max_sges;
rdi->dparms.props.max_send_sge = hfi1_max_sges;
rdi->dparms.props.max_recv_sge = hfi1_max_sges;
rdi->dparms.props.max_sge_rd = hfi1_max_sges;
rdi->dparms.props.max_cq = hfi1_max_cqs;
rdi->dparms.props.max_ah = hfi1_max_ahs;
@ -1497,15 +1498,6 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
props->active_mtu = !valid_ib_mtu(ppd->ibmtu) ? props->max_mtu :
mtu_to_enum(ppd->ibmtu, IB_MTU_4096);
/*
* sm_lid of 0xFFFF needs special handling so that it can
* be differentiated from a permissve LID of 0xFFFF.
* We set the grh_required flag here so the SA can program
* the DGID in the address handle appropriately
*/
if (props->sm_lid == be16_to_cpu(IB_LID_PERMISSIVE))
props->grh_required = true;
return 0;
}
@ -1892,7 +1884,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
ibdev->process_mad = hfi1_process_mad;
ibdev->get_dev_fw_str = hfi1_get_dev_fw_str;
strncpy(ibdev->node_desc, init_utsname()->nodename,
strlcpy(ibdev->node_desc, init_utsname()->nodename,
sizeof(ibdev->node_desc));
/*

@ -1,5 +1,5 @@
/*
* Copyright(c) 2017 Intel Corporation.
* Copyright(c) 2017 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -120,7 +120,6 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
uctxt->seq_cnt = 1;
uctxt->is_vnic = true;
if (dd->num_msix_entries)
hfi1_set_vnic_msix_info(uctxt);
hfi1_stats.sps_ctxts++;
@ -136,7 +135,6 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
dd_dev_dbg(dd, "closing vnic context %d\n", uctxt->ctxt);
flush_wc();
if (dd->num_msix_entries)
hfi1_reset_vnic_msix_info(uctxt);
/*
@ -818,14 +816,14 @@ struct net_device *hfi1_vnic_alloc_rn(struct ib_device *device,
size = sizeof(struct opa_vnic_rdma_netdev) + sizeof(*vinfo);
netdev = alloc_netdev_mqs(size, name, name_assign_type, setup,
dd->chip_sdma_engines, dd->num_vnic_contexts);
chip_sdma_engines(dd), dd->num_vnic_contexts);
if (!netdev)
return ERR_PTR(-ENOMEM);
rn = netdev_priv(netdev);
vinfo = opa_vnic_dev_priv(netdev);
vinfo->dd = dd;
vinfo->num_tx_q = dd->chip_sdma_engines;
vinfo->num_tx_q = chip_sdma_engines(dd);
vinfo->num_rx_q = dd->num_vnic_contexts;
vinfo->netdev = netdev;
rn->free_rdma_netdev = hfi1_vnic_free_rn;

@ -44,13 +44,11 @@ struct ib_ah *hns_roce_create_ah(struct ib_pd *ibpd,
struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibpd->device);
const struct ib_gid_attr *gid_attr;
struct device *dev = hr_dev->dev;
struct ib_gid_attr gid_attr;
struct hns_roce_ah *ah;
u16 vlan_tag = 0xffff;
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
union ib_gid sgid;
int ret;
ah = kzalloc(sizeof(*ah), GFP_ATOMIC);
if (!ah)
@ -59,18 +57,9 @@ struct ib_ah *hns_roce_create_ah(struct ib_pd *ibpd,
/* Get mac address */
memcpy(ah->av.mac, ah_attr->roce.dmac, ETH_ALEN);
/* Get source gid */
ret = ib_get_cached_gid(ibpd->device, rdma_ah_get_port_num(ah_attr),
grh->sgid_index, &sgid, &gid_attr);
if (ret) {
dev_err(dev, "get sgid failed! ret = %d\n", ret);
kfree(ah);
return ERR_PTR(ret);
}
if (is_vlan_dev(gid_attr.ndev))
vlan_tag = vlan_dev_vlan_id(gid_attr.ndev);
dev_put(gid_attr.ndev);
gid_attr = ah_attr->grh.sgid_attr;
if (is_vlan_dev(gid_attr->ndev))
vlan_tag = vlan_dev_vlan_id(gid_attr->ndev);
if (vlan_tag < 0x1000)
vlan_tag |= (rdma_ah_get_sl(ah_attr) &
@ -108,7 +97,7 @@ int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
rdma_ah_set_static_rate(ah_attr, ah->av.stat_rate);
rdma_ah_set_grh(ah_attr, NULL,
(le32_to_cpu(ah->av.sl_tclass_flowlabel) &
HNS_ROCE_FLOW_LABLE_MASK), ah->av.gid_index,
HNS_ROCE_FLOW_LABEL_MASK), ah->av.gid_index,
ah->av.hop_limit,
(le32_to_cpu(ah->av.sl_tclass_flowlabel) >>
HNS_ROCE_TCLASS_SHIFT));

@ -382,15 +382,6 @@
#define ROCEE_VF_EQ_DB_CFG0_REG 0x238
#define ROCEE_VF_EQ_DB_CFG1_REG 0x23C
#define ROCEE_VF_SMAC_CFG0_REG 0x12000
#define ROCEE_VF_SMAC_CFG1_REG 0x12004
#define ROCEE_VF_SGID_CFG0_REG 0x10000
#define ROCEE_VF_SGID_CFG1_REG 0x10004
#define ROCEE_VF_SGID_CFG2_REG 0x10008
#define ROCEE_VF_SGID_CFG3_REG 0x1000c
#define ROCEE_VF_SGID_CFG4_REG 0x10010
#define ROCEE_VF_ABN_INT_CFG_REG 0x13000
#define ROCEE_VF_ABN_INT_ST_REG 0x13004
#define ROCEE_VF_ABN_INT_EN_REG 0x13008

@ -41,6 +41,8 @@ int hns_roce_db_map_user(struct hns_roce_ucontext *context, unsigned long virt,
found:
db->dma = sg_dma_address(page->umem->sg_head.sgl) +
(virt & ~PAGE_MASK);
page->umem->sg_head.sgl->offset = virt & ~PAGE_MASK;
db->virt_addr = sg_virt(page->umem->sg_head.sgl);
db->u.user_page = page;
refcount_inc(&page->refcount);

@ -76,7 +76,7 @@
/* 4G/4K = 1M */
#define HNS_ROCE_SL_SHIFT 28
#define HNS_ROCE_TCLASS_SHIFT 20
#define HNS_ROCE_FLOW_LABLE_MASK 0xfffff
#define HNS_ROCE_FLOW_LABEL_MASK 0xfffff
#define HNS_ROCE_MAX_PORTS 6
#define HNS_ROCE_MAX_GID_NUM 16
@ -110,6 +110,7 @@
enum {
HNS_ROCE_SUPPORT_RQ_RECORD_DB = 1 << 0,
HNS_ROCE_SUPPORT_SQ_RECORD_DB = 1 << 1,
};
enum {
@ -190,7 +191,8 @@ enum {
HNS_ROCE_CAP_FLAG_REREG_MR = BIT(0),
HNS_ROCE_CAP_FLAG_ROCE_V1_V2 = BIT(1),
HNS_ROCE_CAP_FLAG_RQ_INLINE = BIT(2),
HNS_ROCE_CAP_FLAG_RECORD_DB = BIT(3)
HNS_ROCE_CAP_FLAG_RECORD_DB = BIT(3),
HNS_ROCE_CAP_FLAG_SQ_RECORD_DB = BIT(4),
};
enum hns_roce_mtt_type {
@ -385,6 +387,7 @@ struct hns_roce_db {
struct hns_roce_user_db_page *user_page;
} u;
dma_addr_t dma;
void *virt_addr;
int index;
int order;
};
@ -524,7 +527,9 @@ struct hns_roce_qp {
struct hns_roce_buf hr_buf;
struct hns_roce_wq rq;
struct hns_roce_db rdb;
struct hns_roce_db sdb;
u8 rdb_en;
u8 sdb_en;
u32 doorbell_qpn;
__le32 sq_signal_bits;
u32 sq_next_wqe;
@ -579,22 +584,22 @@ struct hns_roce_ceqe {
};
struct hns_roce_aeqe {
u32 asyn;
__le32 asyn;
union {
struct {
u32 qp;
__le32 qp;
u32 rsv0;
u32 rsv1;
} qp_event;
struct {
u32 cq;
__le32 cq;
u32 rsv0;
u32 rsv1;
} cq_event;
struct {
u32 ceqe;
__le32 ceqe;
u32 rsv0;
u32 rsv1;
} ce_event;
@ -641,6 +646,8 @@ struct hns_roce_eq {
int shift;
dma_addr_t cur_eqe_ba;
dma_addr_t nxt_eqe_ba;
int event_type;
int sub_type;
};
struct hns_roce_eq_table {
@ -720,10 +727,21 @@ struct hns_roce_caps {
u32 eqe_ba_pg_sz;
u32 eqe_buf_pg_sz;
u32 eqe_hop_num;
u32 sl_num;
u32 tsq_buf_pg_sz;
u32 tpq_buf_pg_sz;
u32 chunk_sz; /* chunk size in non multihop mode*/
u64 flags;
};
struct hns_roce_work {
struct hns_roce_dev *hr_dev;
struct work_struct work;
u32 qpn;
int event_type;
int sub_type;
};
struct hns_roce_hw {
int (*reset)(struct hns_roce_dev *hr_dev, bool enable);
int (*cmq_init)(struct hns_roce_dev *hr_dev);
@ -736,7 +754,7 @@ struct hns_roce_hw {
u16 token, int event);
int (*chk_mbox)(struct hns_roce_dev *hr_dev, unsigned long timeout);
int (*set_gid)(struct hns_roce_dev *hr_dev, u8 port, int gid_index,
union ib_gid *gid, const struct ib_gid_attr *attr);
const union ib_gid *gid, const struct ib_gid_attr *attr);
int (*set_mac)(struct hns_roce_dev *hr_dev, u8 phy_port, u8 *addr);
void (*set_mtu)(struct hns_roce_dev *hr_dev, u8 phy_port,
enum ib_mtu mtu);
@ -760,10 +778,10 @@ struct hns_roce_hw {
int attr_mask, enum ib_qp_state cur_state,
enum ib_qp_state new_state);
int (*destroy_qp)(struct ib_qp *ibqp);
int (*post_send)(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr);
int (*post_recv)(struct ib_qp *qp, struct ib_recv_wr *recv_wr,
struct ib_recv_wr **bad_recv_wr);
int (*post_send)(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr);
int (*post_recv)(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
int (*req_notify_cq)(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
int (*poll_cq)(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr);
@ -816,6 +834,7 @@ struct hns_roce_dev {
u32 tptr_size; /*only for hw v1*/
const struct hns_roce_hw *hw;
void *priv;
struct workqueue_struct *irq_workq;
};
static inline struct hns_roce_dev *to_hr_dev(struct ib_device *ib_dev)
@ -864,7 +883,7 @@ static inline struct hns_roce_sqp *hr_to_hr_sqp(struct hns_roce_qp *hr_qp)
return container_of(hr_qp, struct hns_roce_sqp, hr_qp);
}
static inline void hns_roce_write64_k(__be32 val[2], void __iomem *dest)
static inline void hns_roce_write64_k(__le32 val[2], void __iomem *dest)
{
__raw_writeq(*(u64 *) val, dest);
}
@ -982,7 +1001,7 @@ void hns_roce_qp_remove(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp);
void hns_roce_qp_free(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp);
void hns_roce_release_range_qp(struct hns_roce_dev *hr_dev, int base_qpn,
int cnt);
__be32 send_ieth(struct ib_send_wr *wr);
__be32 send_ieth(const struct ib_send_wr *wr);
int to_hr_qp_type(int qp_type);
struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,

@ -170,7 +170,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
case 3:
mhop->l2_idx = table_idx & (chunk_ba_num - 1);
mhop->l1_idx = table_idx / chunk_ba_num & (chunk_ba_num - 1);
mhop->l0_idx = table_idx / chunk_ba_num / chunk_ba_num;
mhop->l0_idx = (table_idx / chunk_ba_num) / chunk_ba_num;
break;
case 2:
mhop->l1_idx = table_idx & (chunk_ba_num - 1);
@ -342,7 +342,7 @@ static int hns_roce_set_hem(struct hns_roce_dev *hr_dev,
} else {
break;
}
msleep(HW_SYNC_SLEEP_TIME_INTERVAL);
mdelay(HW_SYNC_SLEEP_TIME_INTERVAL);
}
bt_cmd_l = (u32)bt_ba;
@ -494,6 +494,9 @@ static int hns_roce_table_mhop_get(struct hns_roce_dev *hr_dev,
step_idx = 1;
} else if (hop_num == HNS_ROCE_HOP_NUM_0) {
step_idx = 0;
} else {
ret = -EINVAL;
goto err_dma_alloc_l1;
}
/* set HEM base address to hardware */
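
The three-hop case in hns_roce_calc_hem_mhop() above splits a flat table index into per-level indices. The sketch below reproduces that arithmetic in user space and shows that it is a base-chunk_ba_num digit decomposition. It assumes chunk_ba_num is a power of two (the mask form `& (chunk_ba_num - 1)` only equals a modulo in that case); struct mhop_idx, split_three_hop() and join_three_hop() are hypothetical names for illustration, not driver symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three per-level indices. */
struct mhop_idx {
	uint32_t l0;
	uint32_t l1;
	uint32_t l2;
};

/* Split a flat index the same way the "case 3" branch does. */
static inline struct mhop_idx split_three_hop(uint32_t table_idx,
					      uint32_t chunk_ba_num)
{
	struct mhop_idx m;

	/* Assumption: the mask trick requires a power-of-two chunk count. */
	assert(chunk_ba_num != 0 && (chunk_ba_num & (chunk_ba_num - 1)) == 0);

	m.l2 = table_idx & (chunk_ba_num - 1);
	m.l1 = (table_idx / chunk_ba_num) & (chunk_ba_num - 1);
	m.l0 = (table_idx / chunk_ba_num) / chunk_ba_num;
	return m;
}

/* Inverse of split_three_hop(): rebuild the flat index. */
static inline uint32_t join_three_hop(struct mhop_idx m, uint32_t chunk_ba_num)
{
	return (m.l0 * chunk_ba_num + m.l1) * chunk_ba_num + m.l2;
}
```

Because `/` binds tighter than `&` in C, the parentheses added to the l0 expression in this hunk change readability only, not the computed value.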

File diff suppressed because it is too large

@ -260,7 +260,7 @@ struct hns_roce_cqe {
__le32 cqe_byte_4;
union {
__le32 r_key;
__be32 immediate_data;
__le32 immediate_data;
};
__le32 byte_cnt;
__le32 cqe_byte_16;

File diff suppressed because it is too large

@ -112,6 +112,9 @@
(step_idx == 1 && hop_num == 1) || \
(step_idx == 2 && hop_num == 2))
#define CMD_CSQ_DESC_NUM 1024
#define CMD_CRQ_DESC_NUM 1024
enum {
NO_ARMED = 0x0,
REG_NXT_CEQE = 0x2,
@ -203,6 +206,10 @@ enum hns_roce_opcode_type {
HNS_ROCE_OPC_ALLOC_PF_RES = 0x8004,
HNS_ROCE_OPC_QUERY_PF_RES = 0x8400,
HNS_ROCE_OPC_ALLOC_VF_RES = 0x8401,
HNS_ROCE_OPC_CFG_EXT_LLM = 0x8403,
HNS_ROCE_OPC_CFG_TMOUT_LLM = 0x8404,
HNS_ROCE_OPC_CFG_SGID_TB = 0x8500,
HNS_ROCE_OPC_CFG_SMAC_TB = 0x8501,
HNS_ROCE_OPC_CFG_BT_ATTR = 0x8506,
};
@ -447,8 +454,8 @@ struct hns_roce_v2_qp_context {
#define V2_QPC_BYTE_24_TC_S 8
#define V2_QPC_BYTE_24_TC_M GENMASK(15, 8)
#define V2_QPC_BYTE_24_VLAN_IDX_S 16
#define V2_QPC_BYTE_24_VLAN_IDX_M GENMASK(27, 16)
#define V2_QPC_BYTE_24_VLAN_ID_S 16
#define V2_QPC_BYTE_24_VLAN_ID_M GENMASK(27, 16)
#define V2_QPC_BYTE_24_MTU_S 28
#define V2_QPC_BYTE_24_MTU_M GENMASK(31, 28)
@ -768,7 +775,7 @@ struct hns_roce_v2_cqe {
__le32 byte_4;
union {
__le32 rkey;
__be32 immtdata;
__le32 immtdata;
};
__le32 byte_12;
__le32 byte_16;
@ -926,7 +933,7 @@ struct hns_roce_v2_cq_db {
struct hns_roce_v2_ud_send_wqe {
__le32 byte_4;
__le32 msg_len;
__be32 immtdata;
__le32 immtdata;
__le32 byte_16;
__le32 byte_20;
__le32 byte_24;
@ -1012,7 +1019,7 @@ struct hns_roce_v2_rc_send_wqe {
__le32 msg_len;
union {
__le32 inv_key;
__be32 immtdata;
__le32 immtdata;
};
__le32 byte_16;
__le32 byte_20;
@ -1061,6 +1068,40 @@ struct hns_roce_query_version {
__le32 rsv[5];
};
struct hns_roce_cfg_llm_a {
__le32 base_addr_l;
__le32 base_addr_h;
__le32 depth_pgsz_init_en;
__le32 head_ba_l;
__le32 head_ba_h_nxtptr;
__le32 head_ptr;
};
#define CFG_LLM_QUE_DEPTH_S 0
#define CFG_LLM_QUE_DEPTH_M GENMASK(12, 0)
#define CFG_LLM_QUE_PGSZ_S 16
#define CFG_LLM_QUE_PGSZ_M GENMASK(19, 16)
#define CFG_LLM_INIT_EN_S 20
#define CFG_LLM_INIT_EN_M GENMASK(20, 20)
#define CFG_LLM_HEAD_PTR_S 0
#define CFG_LLM_HEAD_PTR_M GENMASK(11, 0)
struct hns_roce_cfg_llm_b {
__le32 tail_ba_l;
__le32 tail_ba_h;
__le32 tail_ptr;
__le32 rsv[3];
};
#define CFG_LLM_TAIL_BA_H_S 0
#define CFG_LLM_TAIL_BA_H_M GENMASK(19, 0)
#define CFG_LLM_TAIL_PTR_S 0
#define CFG_LLM_TAIL_PTR_M GENMASK(11, 0)
struct hns_roce_cfg_global_param {
__le32 time_cfg_udp_port;
__le32 rsv[5];
@ -1072,7 +1113,7 @@ struct hns_roce_cfg_global_param {
#define CFG_GLOBAL_PARAM_DATA_0_ROCEE_UDP_PORT_S 16
#define CFG_GLOBAL_PARAM_DATA_0_ROCEE_UDP_PORT_M GENMASK(31, 16)
struct hns_roce_pf_res {
struct hns_roce_pf_res_a {
__le32 rsv;
__le32 qpc_bt_idx_num;
__le32 srqc_bt_idx_num;
@ -1111,6 +1152,32 @@ struct hns_roce_pf_res {
#define PF_RES_DATA_5_PF_EQC_BT_NUM_S 16
#define PF_RES_DATA_5_PF_EQC_BT_NUM_M GENMASK(25, 16)
struct hns_roce_pf_res_b {
__le32 rsv0;
__le32 smac_idx_num;
__le32 sgid_idx_num;
__le32 qid_idx_sl_num;
__le32 rsv[2];
};
#define PF_RES_DATA_1_PF_SMAC_IDX_S 0
#define PF_RES_DATA_1_PF_SMAC_IDX_M GENMASK(7, 0)
#define PF_RES_DATA_1_PF_SMAC_NUM_S 8
#define PF_RES_DATA_1_PF_SMAC_NUM_M GENMASK(16, 8)
#define PF_RES_DATA_2_PF_SGID_IDX_S 0
#define PF_RES_DATA_2_PF_SGID_IDX_M GENMASK(7, 0)
#define PF_RES_DATA_2_PF_SGID_NUM_S 8
#define PF_RES_DATA_2_PF_SGID_NUM_M GENMASK(16, 8)
#define PF_RES_DATA_3_PF_QID_IDX_S 0
#define PF_RES_DATA_3_PF_QID_IDX_M GENMASK(9, 0)
#define PF_RES_DATA_3_PF_SL_NUM_S 16
#define PF_RES_DATA_3_PF_SL_NUM_M GENMASK(26, 16)
struct hns_roce_vf_res_a {
__le32 vf_id;
__le32 vf_qpc_bt_idx_num;
@ -1179,13 +1246,6 @@ struct hns_roce_vf_res_b {
#define VF_RES_B_DATA_3_VF_SL_NUM_S 16
#define VF_RES_B_DATA_3_VF_SL_NUM_M GENMASK(19, 16)
/* Reg field definition */
#define ROCEE_VF_SMAC_CFG1_VF_SMAC_H_S 0
#define ROCEE_VF_SMAC_CFG1_VF_SMAC_H_M GENMASK(15, 0)
#define ROCEE_VF_SGID_CFG4_SGID_TYPE_S 0
#define ROCEE_VF_SGID_CFG4_SGID_TYPE_M GENMASK(1, 0)
struct hns_roce_cfg_bt_attr {
__le32 vf_qpc_cfg;
__le32 vf_srqc_cfg;
@ -1230,6 +1290,32 @@ struct hns_roce_cfg_bt_attr {
#define CFG_BT_ATTR_DATA_3_VF_MPT_HOPNUM_S 8
#define CFG_BT_ATTR_DATA_3_VF_MPT_HOPNUM_M GENMASK(9, 8)
struct hns_roce_cfg_sgid_tb {
__le32 table_idx_rsv;
__le32 vf_sgid_l;
__le32 vf_sgid_ml;
__le32 vf_sgid_mh;
__le32 vf_sgid_h;
__le32 vf_sgid_type_rsv;
};
#define CFG_SGID_TB_TABLE_IDX_S 0
#define CFG_SGID_TB_TABLE_IDX_M GENMASK(7, 0)
#define CFG_SGID_TB_VF_SGID_TYPE_S 0
#define CFG_SGID_TB_VF_SGID_TYPE_M GENMASK(1, 0)
struct hns_roce_cfg_smac_tb {
__le32 tb_idx_rsv;
__le32 vf_smac_l;
__le32 vf_smac_h_rsv;
__le32 rsv[3];
};
#define CFG_SMAC_TB_IDX_S 0
#define CFG_SMAC_TB_IDX_M GENMASK(7, 0)
#define CFG_SMAC_TB_VF_SMAC_H_S 0
#define CFG_SMAC_TB_VF_SMAC_H_M GENMASK(15, 0)
struct hns_roce_cmq_desc {
__le16 opcode;
__le16 flag;
@ -1276,8 +1362,32 @@ struct hns_roce_v2_cmq {
u16 last_status;
};
enum hns_roce_link_table_type {
TSQ_LINK_TABLE,
TPQ_LINK_TABLE,
};
struct hns_roce_link_table {
struct hns_roce_buf_list table;
struct hns_roce_buf_list *pg_list;
u32 npages;
u32 pg_sz;
};
struct hns_roce_link_table_entry {
u32 blk_ba0;
u32 blk_ba1_nxt_ptr;
};
#define HNS_ROCE_LINK_TABLE_BA1_S 0
#define HNS_ROCE_LINK_TABLE_BA1_M GENMASK(19, 0)
#define HNS_ROCE_LINK_TABLE_NXT_PTR_S 20
#define HNS_ROCE_LINK_TABLE_NXT_PTR_M GENMASK(31, 20)
struct hns_roce_v2_priv {
struct hns_roce_v2_cmq cmq;
struct hns_roce_link_table tsq;
struct hns_roce_link_table tpq;
};
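
The `_S` / `_M` pairs added in this header (for example HNS_ROCE_LINK_TABLE_BA1 and HNS_ROCE_LINK_TABLE_NXT_PTR) describe a shift and a mask for one field packed into a 32-bit word. The sketch below shows how such a pair can be used to pack and unpack a link-table entry word. It is a user-space illustration under stated assumptions: GENMASK32() is a stand-in for the kernel's GENMASK() fixed at 32 bits, and field_get() / field_set() are hypothetical helpers, not the driver's own accessors.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK(), fixed at 32 bits. */
#define GENMASK32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Field layout copied from the hunk above. */
#define LINK_TABLE_BA1_S     0
#define LINK_TABLE_BA1_M     GENMASK32(19, 0)
#define LINK_TABLE_NXT_PTR_S 20
#define LINK_TABLE_NXT_PTR_M GENMASK32(31, 20)

/* Extract one field from a packed word. */
static inline uint32_t field_get(uint32_t word, uint32_t mask, unsigned shift)
{
	return (word & mask) >> shift;
}

/* Replace one field in a packed word; bits of val that do not fit
 * inside the mask are dropped. */
static inline uint32_t field_set(uint32_t word, uint32_t mask, unsigned shift,
				 uint32_t val)
{
	return (word & ~mask) | ((val << shift) & mask);
}
```

For instance, setting BA1 to 0xABCDE and NXT_PTR to 0x123 in an empty word yields 0x123ABCDE. Endianness conversion for fields declared as __le32 is deliberately left out of this sketch.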
struct hns_roce_eq_context {

@ -74,8 +74,7 @@ static int hns_roce_set_mac(struct hns_roce_dev *hr_dev, u8 port, u8 *addr)
return hr_dev->hw->set_mac(hr_dev, phy_port, addr);
}
static int hns_roce_add_gid(const union ib_gid *gid,
const struct ib_gid_attr *attr, void **context)
static int hns_roce_add_gid(const struct ib_gid_attr *attr, void **context)
{
struct hns_roce_dev *hr_dev = to_hr_dev(attr->device);
u8 port = attr->port_num - 1;
@ -87,8 +86,7 @@ static int hns_roce_add_gid(const union ib_gid *gid,
spin_lock_irqsave(&hr_dev->iboe.lock, flags);
ret = hr_dev->hw->set_gid(hr_dev, port, attr->index,
(union ib_gid *)gid, attr);
ret = hr_dev->hw->set_gid(hr_dev, port, attr->index, &attr->gid, attr);
spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
@ -208,7 +206,8 @@ static int hns_roce_query_device(struct ib_device *ib_dev,
props->max_qp_wr = hr_dev->caps.max_wqes;
props->device_cap_flags = IB_DEVICE_PORT_ACTIVE_EVENT |
IB_DEVICE_RC_RNR_NAK_GEN;
props->max_sge = max(hr_dev->caps.max_sq_sg, hr_dev->caps.max_rq_sg);
props->max_send_sge = hr_dev->caps.max_sq_sg;
props->max_recv_sge = hr_dev->caps.max_rq_sg;
props->max_sge_rd = 1;
props->max_cq = hr_dev->caps.num_cqs;
props->max_cqe = hr_dev->caps.max_cqes;
@ -535,6 +534,9 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
(1ULL << IB_USER_VERBS_CMD_QUERY_QP) |
(1ULL << IB_USER_VERBS_CMD_DESTROY_QP);
ib_dev->uverbs_ex_cmd_mask |=
(1ULL << IB_USER_VERBS_EX_CMD_MODIFY_CQ);
/* HCA||device||port */
ib_dev->modify_device = hns_roce_modify_device;
ib_dev->query_device = hns_roce_query_device;
@ -887,8 +889,7 @@ error_failed_cmd_init:
error_failed_cmq_init:
if (hr_dev->hw->reset) {
ret = hr_dev->hw->reset(hr_dev, false);
if (ret)
if (hr_dev->hw->reset(hr_dev, false))
dev_err(dev, "Dereset RoCE engine failed!\n");
}

@ -37,7 +37,7 @@
static int hns_roce_pd_alloc(struct hns_roce_dev *hr_dev, unsigned long *pdn)
{
return hns_roce_bitmap_alloc(&hr_dev->pd_bitmap, pdn);
return hns_roce_bitmap_alloc(&hr_dev->pd_bitmap, pdn) ? -ENOMEM : 0;
}
static void hns_roce_pd_free(struct hns_roce_dev *hr_dev, unsigned long pdn)

@ -115,7 +115,10 @@ static int hns_roce_reserve_range_qp(struct hns_roce_dev *hr_dev, int cnt,
{
struct hns_roce_qp_table *qp_table = &hr_dev->qp_table;
return hns_roce_bitmap_alloc_range(&qp_table->bitmap, cnt, align, base);
return hns_roce_bitmap_alloc_range(&qp_table->bitmap, cnt, align,
base) ?
-ENOMEM :
0;
}
enum hns_roce_qp_state to_hns_roce_state(enum ib_qp_state state)
@ -489,6 +492,14 @@ static int hns_roce_set_kernel_sq_size(struct hns_roce_dev *hr_dev,
return 0;
}
static int hns_roce_qp_has_sq(struct ib_qp_init_attr *attr)
{
if (attr->qp_type == IB_QPT_XRC_TGT)
return 0;
return 1;
}
static int hns_roce_qp_has_rq(struct ib_qp_init_attr *attr)
{
if (attr->qp_type == IB_QPT_XRC_INI ||
@ -613,6 +624,23 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
goto err_mtt;
}
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SQ_RECORD_DB) &&
(udata->inlen >= sizeof(ucmd)) &&
(udata->outlen >= sizeof(resp)) &&
hns_roce_qp_has_sq(init_attr)) {
ret = hns_roce_db_map_user(
to_hr_ucontext(ib_pd->uobject->context),
ucmd.sdb_addr, &hr_qp->sdb);
if (ret) {
dev_err(dev, "sq record doorbell map failed!\n");
goto err_mtt;
}
/* indicate kernel supports sq record db */
resp.cap_flags |= HNS_ROCE_SUPPORT_SQ_RECORD_DB;
hr_qp->sdb_en = 1;
}
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) &&
(udata->outlen >= sizeof(resp)) &&
hns_roce_qp_has_rq(init_attr)) {
@ -621,7 +649,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
ucmd.db_addr, &hr_qp->rdb);
if (ret) {
dev_err(dev, "rq record doorbell map failed!\n");
goto err_mtt;
goto err_sq_dbmap;
}
}
} else {
@ -734,7 +762,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
if (ib_pd->uobject && (udata->outlen >= sizeof(resp)) &&
(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB)) {
/* indicate kernel supports record db */
/* indicate kernel supports rq record db */
resp.cap_flags |= HNS_ROCE_SUPPORT_RQ_RECORD_DB;
ret = ib_copy_to_udata(udata, &resp, sizeof(resp));
if (ret)
@ -770,6 +798,16 @@ err_wrid:
kfree(hr_qp->rq.wrid);
}
err_sq_dbmap:
if (ib_pd->uobject)
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SQ_RECORD_DB) &&
(udata->inlen >= sizeof(ucmd)) &&
(udata->outlen >= sizeof(resp)) &&
hns_roce_qp_has_sq(init_attr))
hns_roce_db_unmap_user(
to_hr_ucontext(ib_pd->uobject->context),
&hr_qp->sdb);
err_mtt:
hns_roce_mtt_cleanup(hr_dev, &hr_qp->mtt);
@ -903,6 +941,17 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
new_state = attr_mask & IB_QP_STATE ?
attr->qp_state : cur_state;
if (ibqp->uobject &&
(attr_mask & IB_QP_STATE) && new_state == IB_QPS_ERR) {
if (hr_qp->sdb_en == 1) {
hr_qp->sq.head = *(int *)(hr_qp->sdb.virt_addr);
hr_qp->rq.head = *(int *)(hr_qp->rdb.virt_addr);
} else {
dev_warn(dev, "flush cqe is not supported in userspace!\n");
goto out;
}
}
if (!ib_modify_qp_is_ok(cur_state, new_state, ibqp->qp_type, attr_mask,
IB_LINK_LAYER_ETHERNET)) {
dev_err(dev, "ib_modify_qp_is_ok failed\n");

@ -1,6 +1,7 @@
config INFINIBAND_I40IW
tristate "Intel(R) Ethernet X722 iWARP Driver"
depends on INET && I40E
depends on IPV6 || !IPV6
depends on PCI
select GENERIC_ALLOCATOR
---help---

@ -57,6 +57,7 @@
#include <net/addrconf.h>
#include <net/ip6_route.h>
#include <net/ip_fib.h>
#include <net/secure_seq.h>
#include <net/tcp.h>
#include <asm/checksum.h>
@ -2164,7 +2165,6 @@ static struct i40iw_cm_node *i40iw_make_cm_node(
struct i40iw_cm_listener *listener)
{
struct i40iw_cm_node *cm_node;
struct timespec ts;
int oldarpindex;
int arpindex;
struct net_device *netdev = iwdev->netdev;
@ -2214,10 +2214,26 @@ static struct i40iw_cm_node *i40iw_make_cm_node(
cm_node->tcp_cntxt.rcv_wscale = I40IW_CM_DEFAULT_RCV_WND_SCALE;
cm_node->tcp_cntxt.rcv_wnd =
I40IW_CM_DEFAULT_RCV_WND_SCALED >> I40IW_CM_DEFAULT_RCV_WND_SCALE;
ts = current_kernel_time();
cm_node->tcp_cntxt.loc_seq_num = ts.tv_nsec;
cm_node->tcp_cntxt.mss = (cm_node->ipv4) ? (iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV4) :
(iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV6);
if (cm_node->ipv4) {
cm_node->tcp_cntxt.loc_seq_num = secure_tcp_seq(htonl(cm_node->loc_addr[0]),
htonl(cm_node->rem_addr[0]),
htons(cm_node->loc_port),
htons(cm_node->rem_port));
cm_node->tcp_cntxt.mss = iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV4;
} else if (IS_ENABLED(CONFIG_IPV6)) {
__be32 loc[4] = {
htonl(cm_node->loc_addr[0]), htonl(cm_node->loc_addr[1]),
htonl(cm_node->loc_addr[2]), htonl(cm_node->loc_addr[3])
};
__be32 rem[4] = {
htonl(cm_node->rem_addr[0]), htonl(cm_node->rem_addr[1]),
htonl(cm_node->rem_addr[2]), htonl(cm_node->rem_addr[3])
};
cm_node->tcp_cntxt.loc_seq_num = secure_tcpv6_seq(loc, rem,
htons(cm_node->loc_port),
htons(cm_node->rem_port));
cm_node->tcp_cntxt.mss = iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV6;
}
cm_node->iwdev = iwdev;
cm_node->dev = &iwdev->sc_dev;

@ -435,45 +435,24 @@ void i40iw_process_aeq(struct i40iw_device *iwdev)
}
/**
* i40iw_manage_apbvt - add or delete tcp port
* i40iw_cqp_manage_abvpt_cmd - send cqp command manage abpvt
* @iwdev: iwarp device
* @accel_local_port: port for apbvt
* @add_port: add or delete port
*/
int i40iw_manage_apbvt(struct i40iw_device *iwdev, u16 accel_local_port, bool add_port)
static enum i40iw_status_code
i40iw_cqp_manage_abvpt_cmd(struct i40iw_device *iwdev,
u16 accel_local_port,
bool add_port)
{
struct i40iw_apbvt_info *info;
struct i40iw_cqp_request *cqp_request;
struct cqp_commands_info *cqp_info;
unsigned long flags;
struct i40iw_cm_core *cm_core = &iwdev->cm_core;
enum i40iw_status_code status = 0;
bool in_use;
/* apbvt_lock is held across CQP delete APBVT OP (non-waiting) to
* protect against race where add APBVT CQP can race ahead of the delete
* APBVT for same port.
*/
spin_lock_irqsave(&cm_core->apbvt_lock, flags);
if (!add_port) {
in_use = i40iw_port_in_use(cm_core, accel_local_port);
if (in_use)
goto exit;
clear_bit(accel_local_port, cm_core->ports_in_use);
} else {
in_use = test_and_set_bit(accel_local_port,
cm_core->ports_in_use);
spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
if (in_use)
return 0;
}
enum i40iw_status_code status;
cqp_request = i40iw_get_cqp_request(&iwdev->cqp, add_port);
if (!cqp_request) {
status = -ENOMEM;
goto exit;
}
if (!cqp_request)
return I40IW_ERR_NO_MEMORY;
cqp_info = &cqp_request->info;
info = &cqp_info->in.u.manage_apbvt_entry.info;
@ -489,13 +468,53 @@ int i40iw_manage_apbvt(struct i40iw_device *iwdev, u16 accel_local_port, bool ad
status = i40iw_handle_cqp_op(iwdev, cqp_request);
if (status)
i40iw_pr_err("CQP-OP Manage APBVT entry fail");
exit:
if (!add_port)
spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
return status;
}
/**
* i40iw_manage_apbvt - add or delete tcp port
* @iwdev: iwarp device
* @accel_local_port: port for apbvt
* @add_port: add or delete port
*/
enum i40iw_status_code i40iw_manage_apbvt(struct i40iw_device *iwdev,
u16 accel_local_port,
bool add_port)
{
struct i40iw_cm_core *cm_core = &iwdev->cm_core;
enum i40iw_status_code status;
unsigned long flags;
bool in_use;
/* apbvt_lock is held across CQP delete APBVT OP (non-waiting) to
* protect against race where add APBVT CQP can race ahead of the delete
* APBVT for same port.
*/
if (add_port) {
spin_lock_irqsave(&cm_core->apbvt_lock, flags);
in_use = __test_and_set_bit(accel_local_port,
cm_core->ports_in_use);
spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
if (in_use)
return 0;
return i40iw_cqp_manage_abvpt_cmd(iwdev, accel_local_port,
true);
} else {
spin_lock_irqsave(&cm_core->apbvt_lock, flags);
in_use = i40iw_port_in_use(cm_core, accel_local_port);
if (in_use) {
spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
return 0;
}
__clear_bit(accel_local_port, cm_core->ports_in_use);
status = i40iw_cqp_manage_abvpt_cmd(iwdev, accel_local_port,
false);
spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
return status;
}
}
/**
* i40iw_manage_arp_cache - manage hw arp cache
* @iwdev: iwarp device
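
Reviewer note: the reworked i40iw_manage_apbvt() above treats the two directions differently. The add path only holds the lock while it sets the in-use bit and issues the command afterwards; the delete path keeps the lock held across the command so that a later add for the same port cannot overtake it. The following is a minimal user-space sketch of that pattern, not the driver code. The names port_table, send_hw_cmd, manage_port and the users[] counter (standing in for i40iw_port_in_use()) are hypothetical, and a C11 atomic_flag spinlock stands in for spin_lock_irqsave().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define MAX_PORTS 1024 /* small table, for illustration only */

/* Hypothetical stand-in for the driver state; not the i40iw structures. */
struct port_table {
	atomic_flag lock;
	unsigned char in_use[MAX_PORTS]; /* models the ports_in_use bitmap */
	unsigned int users[MAX_PORTS];   /* models i40iw_port_in_use() */
	int hw_adds;                     /* counts simulated add commands */
	int hw_dels;                     /* counts simulated delete commands */
};

static void table_lock(struct port_table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void table_unlock(struct port_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

void port_table_init(struct port_table *t)
{
	memset(t, 0, sizeof(*t));
	atomic_flag_clear(&t->lock);
}

/* Simulated hardware command; always succeeds in this sketch. */
static int send_hw_cmd(struct port_table *t, bool add)
{
	if (add)
		t->hw_adds++;
	else
		t->hw_dels++;
	return 0;
}

int manage_port(struct port_table *t, unsigned int port, bool add)
{
	int status;

	assert(port < MAX_PORTS);

	if (add) {
		bool was_set;

		/* Lock only around the test-and-set of the in-use flag. */
		table_lock(t);
		was_set = t->in_use[port];
		t->in_use[port] = 1;
		table_unlock(t);

		if (was_set)
			return 0; /* already programmed, nothing to send */
		return send_hw_cmd(t, true);
	}

	/* Delete: hold the lock across the command so an add cannot race ahead. */
	table_lock(t);
	if (t->users[port] > 0) {
		table_unlock(t);
		return 0; /* port still has users, keep the entry */
	}
	t->in_use[port] = 0;
	status = send_hw_cmd(t, false);
	table_unlock(t);
	return status;
}
```

In this sketch a second add for an already-programmed port returns early without issuing a command, and a delete is skipped while users[] is non-zero, mirroring the early returns in the hunk.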

@ -71,7 +71,8 @@ static int i40iw_query_device(struct ib_device *ibdev,
props->max_mr_size = I40IW_MAX_OUTBOUND_MESSAGE_SIZE;
props->max_qp = iwdev->max_qp - iwdev->used_qps;
props->max_qp_wr = I40IW_MAX_QP_WRS;
props->max_sge = I40IW_MAX_WQ_FRAGMENT_COUNT;
props->max_send_sge = I40IW_MAX_WQ_FRAGMENT_COUNT;
props->max_recv_sge = I40IW_MAX_WQ_FRAGMENT_COUNT;
props->max_cq = iwdev->max_cq - iwdev->used_cqs;
props->max_cqe = iwdev->max_cqe;
props->max_mr = iwdev->max_mr - iwdev->used_mrs;
@ -1409,6 +1410,7 @@ static void i40iw_set_hugetlb_values(u64 addr, struct i40iw_mr *iwmr)
struct vm_area_struct *vma;
struct hstate *h;
down_read(&current->mm->mmap_sem);
vma = find_vma(current->mm, addr);
if (vma && is_vm_hugetlb_page(vma)) {
h = hstate_vma(vma);
@ -1417,6 +1419,7 @@ static void i40iw_set_hugetlb_values(u64 addr, struct i40iw_mr *iwmr)
iwmr->page_msk = huge_page_mask(h);
}
}
up_read(&current->mm->mmap_sem);
}
/**
@ -2198,8 +2201,8 @@ static void i40iw_copy_sg_list(struct i40iw_sge *sg_list, struct ib_sge *sgl, in
* @bad_wr: return of bad wr if err
*/
static int i40iw_post_send(struct ib_qp *ibqp,
struct ib_send_wr *ib_wr,
struct ib_send_wr **bad_wr)
const struct ib_send_wr *ib_wr,
const struct ib_send_wr **bad_wr)
{
struct i40iw_qp *iwqp;
struct i40iw_qp_uk *ukqp;
@ -2374,9 +2377,8 @@ out:
* @ib_wr: work request for receive
* @bad_wr: bad wr caused an error
*/
static int i40iw_post_recv(struct ib_qp *ibqp,
struct ib_recv_wr *ib_wr,
struct ib_recv_wr **bad_wr)
static int i40iw_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *ib_wr,
const struct ib_recv_wr **bad_wr)
{
struct i40iw_qp *iwqp;
struct i40iw_qp_uk *ukqp;
@ -2700,21 +2702,6 @@ static int i40iw_query_gid(struct ib_device *ibdev,
return 0;
}
/**
* i40iw_modify_port Modify port properties
* @ibdev: device pointer from stack
* @port: port number
* @port_modify_mask: mask for port modifications
* @props: port properties
*/
static int i40iw_modify_port(struct ib_device *ibdev,
u8 port,
int port_modify_mask,
struct ib_port_modify *props)
{
return -ENOSYS;
}
/**
* i40iw_query_pkey - Query partition key
* @ibdev: device pointer from stack
@ -2731,28 +2718,6 @@ static int i40iw_query_pkey(struct ib_device *ibdev,
return 0;
}
/**
* i40iw_create_ah - create address handle
* @ibpd: ptr of pd
* @ah_attr: address handle attributes
*/
static struct ib_ah *i40iw_create_ah(struct ib_pd *ibpd,
struct rdma_ah_attr *attr,
struct ib_udata *udata)
{
return ERR_PTR(-ENOSYS);
}
/**
* i40iw_destroy_ah - Destroy address handle
* @ah: pointer to address handle
*/
static int i40iw_destroy_ah(struct ib_ah *ah)
{
return -ENOSYS;
}
/**
* i40iw_get_vector_affinity - report IRQ affinity mask
* @ibdev: IB device
@ -2820,7 +2785,6 @@ static struct i40iw_ib_device *i40iw_init_rdma_device(struct i40iw_device *iwdev
iwibdev->ibdev.num_comp_vectors = iwdev->ceqs_count;
iwibdev->ibdev.dev.parent = &pcidev->dev;
iwibdev->ibdev.query_port = i40iw_query_port;
iwibdev->ibdev.modify_port = i40iw_modify_port;
iwibdev->ibdev.query_pkey = i40iw_query_pkey;
iwibdev->ibdev.query_gid = i40iw_query_gid;
iwibdev->ibdev.alloc_ucontext = i40iw_alloc_ucontext;
@ -2840,8 +2804,6 @@ static struct i40iw_ib_device *i40iw_init_rdma_device(struct i40iw_device *iwdev
iwibdev->ibdev.alloc_hw_stats = i40iw_alloc_hw_stats;
iwibdev->ibdev.get_hw_stats = i40iw_get_hw_stats;
iwibdev->ibdev.query_device = i40iw_query_device;
iwibdev->ibdev.create_ah = i40iw_create_ah;
iwibdev->ibdev.destroy_ah = i40iw_destroy_ah;
iwibdev->ibdev.drain_sq = i40iw_drain_sq;
iwibdev->ibdev.drain_rq = i40iw_drain_rq;
iwibdev->ibdev.alloc_mr = i40iw_alloc_mr;

@ -82,12 +82,11 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd,
struct mlx4_ib_ah *ah)
{
struct mlx4_ib_dev *ibdev = to_mdev(pd->device);
const struct ib_gid_attr *gid_attr;
struct mlx4_dev *dev = ibdev->dev;
int is_mcast = 0;
struct in6_addr in6;
u16 vlan_tag = 0xffff;
union ib_gid sgid;
struct ib_gid_attr gid_attr;
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
int ret;
@ -96,25 +95,30 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd,
is_mcast = 1;
memcpy(ah->av.eth.mac, ah_attr->roce.dmac, ETH_ALEN);
ret = ib_get_cached_gid(pd->device, rdma_ah_get_port_num(ah_attr),
grh->sgid_index, &sgid, &gid_attr);
if (ret)
return ERR_PTR(ret);
eth_zero_addr(ah->av.eth.s_mac);
if (is_vlan_dev(gid_attr.ndev))
vlan_tag = vlan_dev_vlan_id(gid_attr.ndev);
memcpy(ah->av.eth.s_mac, gid_attr.ndev->dev_addr, ETH_ALEN);
dev_put(gid_attr.ndev);
/*
* If sgid_attr is NULL we are being called by mlx4_ib_create_ah_slave
* and we are directly creating an AV for a slave's gid_index.
*/
gid_attr = ah_attr->grh.sgid_attr;
if (gid_attr) {
if (is_vlan_dev(gid_attr->ndev))
vlan_tag = vlan_dev_vlan_id(gid_attr->ndev);
memcpy(ah->av.eth.s_mac, gid_attr->ndev->dev_addr, ETH_ALEN);
ret = mlx4_ib_gid_index_to_real_index(ibdev, gid_attr);
if (ret < 0)
return ERR_PTR(ret);
ah->av.eth.gid_index = ret;
} else {
/* mlx4_ib_create_ah_slave fills in the s_mac and the vlan */
ah->av.eth.gid_index = ah_attr->grh.sgid_index;
}
if (vlan_tag < 0x1000)
vlan_tag |= (rdma_ah_get_sl(ah_attr) & 7) << 13;
ah->av.eth.port_pd = cpu_to_be32(to_mpd(pd)->pdn |
(rdma_ah_get_port_num(ah_attr) << 24));
ret = mlx4_ib_gid_index_to_real_index(ibdev,
rdma_ah_get_port_num(ah_attr),
grh->sgid_index);
if (ret < 0)
return ERR_PTR(ret);
ah->av.eth.gid_index = ret;
ah->av.eth.vlan = cpu_to_be16(vlan_tag);
ah->av.eth.hop_limit = grh->hop_limit;
if (rdma_ah_get_static_rate(ah_attr)) {
@ -173,6 +177,40 @@ struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr,
return create_ib_ah(pd, ah_attr, ah); /* never fails */
}
/* AH's created via this call must be free'd by mlx4_ib_destroy_ah. */
struct ib_ah *mlx4_ib_create_ah_slave(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
int slave_sgid_index, u8 *s_mac,
u16 vlan_tag)
{
struct rdma_ah_attr slave_attr = *ah_attr;
struct mlx4_ib_ah *mah;
struct ib_ah *ah;
slave_attr.grh.sgid_attr = NULL;
slave_attr.grh.sgid_index = slave_sgid_index;
ah = mlx4_ib_create_ah(pd, &slave_attr, NULL);
if (IS_ERR(ah))
return ah;
ah->device = pd->device;
ah->pd = pd;
ah->type = ah_attr->type;
mah = to_mah(ah);
/* get rid of force-loopback bit */
mah->av.ib.port_pd &= cpu_to_be32(0x7FFFFFFF);
if (ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE)
memcpy(mah->av.eth.s_mac, s_mac, 6);
if (vlan_tag < 0x1000)
vlan_tag |= (rdma_ah_get_sl(ah_attr) & 7) << 13;
mah->av.eth.vlan = cpu_to_be16(vlan_tag);
return ah;
}
int mlx4_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
{
struct mlx4_ib_ah *ah = to_mah(ibah);

@ -506,7 +506,7 @@ int mlx4_ib_send_to_slave(struct mlx4_ib_dev *dev, int slave, u8 port,
{
struct ib_sge list;
struct ib_ud_wr wr;
struct ib_send_wr *bad_wr;
const struct ib_send_wr *bad_wr;
struct mlx4_ib_demux_pv_ctx *tun_ctx;
struct mlx4_ib_demux_pv_qp *tun_qp;
struct mlx4_rcv_tunnel_mad *tun_mad;
@ -1310,7 +1310,8 @@ static int mlx4_ib_post_pv_qp_buf(struct mlx4_ib_demux_pv_ctx *ctx,
int index)
{
struct ib_sge sg_list;
struct ib_recv_wr recv_wr, *bad_recv_wr;
struct ib_recv_wr recv_wr;
const struct ib_recv_wr *bad_recv_wr;
int size;
size = (tun_qp->qp->qp_type == IB_QPT_UD) ?
@ -1361,19 +1362,16 @@ int mlx4_ib_send_to_wire(struct mlx4_ib_dev *dev, int slave, u8 port,
{
struct ib_sge list;
struct ib_ud_wr wr;
struct ib_send_wr *bad_wr;
const struct ib_send_wr *bad_wr;
struct mlx4_ib_demux_pv_ctx *sqp_ctx;
struct mlx4_ib_demux_pv_qp *sqp;
struct mlx4_mad_snd_buf *sqp_mad;
struct ib_ah *ah;
struct ib_qp *send_qp = NULL;
struct ib_global_route *grh;
unsigned wire_tx_ix = 0;
int ret = 0;
u16 wire_pkey_ix;
int src_qpnum;
u8 sgid_index;
sqp_ctx = dev->sriov.sqps[port-1];
@ -1394,16 +1392,11 @@ int mlx4_ib_send_to_wire(struct mlx4_ib_dev *dev, int slave, u8 port,
send_qp = sqp->qp;
/* create ah */
grh = rdma_ah_retrieve_grh(attr);
sgid_index = grh->sgid_index;
grh->sgid_index = 0;
ah = rdma_create_ah(sqp_ctx->pd, attr);
ah = mlx4_ib_create_ah_slave(sqp_ctx->pd, attr,
rdma_ah_retrieve_grh(attr)->sgid_index,
s_mac, vlan_id);
if (IS_ERR(ah))
return -ENOMEM;
grh->sgid_index = sgid_index;
to_mah(ah)->av.ib.gid_index = sgid_index;
/* get rid of force-loopback bit */
to_mah(ah)->av.ib.port_pd &= cpu_to_be32(0x7FFFFFFF);
spin_lock(&sqp->tx_lock);
if (sqp->tx_ix_head - sqp->tx_ix_tail >=
(MLX4_NUM_TUNNEL_BUFS - 1))
@ -1445,12 +1438,6 @@ int mlx4_ib_send_to_wire(struct mlx4_ib_dev *dev, int slave, u8 port,
wr.wr.num_sge = 1;
wr.wr.opcode = IB_WR_SEND;
wr.wr.send_flags = IB_SEND_SIGNALED;
if (s_mac)
memcpy(to_mah(ah)->av.eth.s_mac, s_mac, 6);
if (vlan_id < 0x1000)
vlan_id |= (rdma_ah_get_sl(attr) & 7) << 13;
to_mah(ah)->av.eth.vlan = cpu_to_be16(vlan_id);
ret = ib_post_send(send_qp, &wr.wr, &bad_wr);
if (!ret)
@ -1461,7 +1448,7 @@ int mlx4_ib_send_to_wire(struct mlx4_ib_dev *dev, int slave, u8 port,
spin_unlock(&sqp->tx_lock);
sqp->tx_ring[wire_tx_ix].ah = NULL;
out:
rdma_destroy_ah(ah);
mlx4_ib_destroy_ah(ah);
return ret;
}

@ -246,9 +246,7 @@ static int mlx4_ib_update_gids(struct gid_entry *gids,
return mlx4_ib_update_gids_v1(gids, ibdev, port_num);
}
static int mlx4_ib_add_gid(const union ib_gid *gid,
const struct ib_gid_attr *attr,
void **context)
static int mlx4_ib_add_gid(const struct ib_gid_attr *attr, void **context)
{
struct mlx4_ib_dev *ibdev = to_mdev(attr->device);
struct mlx4_ib_iboe *iboe = &ibdev->iboe;
@ -271,8 +269,9 @@ static int mlx4_ib_add_gid(const union ib_gid *gid,
port_gid_table = &iboe->gids[attr->port_num - 1];
spin_lock_bh(&iboe->lock);
for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i) {
if (!memcmp(&port_gid_table->gids[i].gid, gid, sizeof(*gid)) &&
(port_gid_table->gids[i].gid_type == attr->gid_type)) {
if (!memcmp(&port_gid_table->gids[i].gid,
&attr->gid, sizeof(attr->gid)) &&
port_gid_table->gids[i].gid_type == attr->gid_type) {
found = i;
break;
}
@ -289,7 +288,8 @@ static int mlx4_ib_add_gid(const union ib_gid *gid,
ret = -ENOMEM;
} else {
*context = port_gid_table->gids[free].ctx;
memcpy(&port_gid_table->gids[free].gid, gid, sizeof(*gid));
memcpy(&port_gid_table->gids[free].gid,
&attr->gid, sizeof(attr->gid));
port_gid_table->gids[free].gid_type = attr->gid_type;
port_gid_table->gids[free].ctx->real_index = free;
port_gid_table->gids[free].ctx->refcount = 1;
@ -380,17 +380,15 @@ static int mlx4_ib_del_gid(const struct ib_gid_attr *attr, void **context)
}
int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
u8 port_num, int index)
const struct ib_gid_attr *attr)
{
struct mlx4_ib_iboe *iboe = &ibdev->iboe;
struct gid_cache_context *ctx = NULL;
union ib_gid gid;
struct mlx4_port_gid_table *port_gid_table;
int real_index = -EINVAL;
int i;
int ret;
unsigned long flags;
struct ib_gid_attr attr;
u8 port_num = attr->port_num;
if (port_num > MLX4_MAX_PORTS)
return -EINVAL;
@ -399,21 +397,15 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
port_num = 1;
if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num))
return index;
ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, &attr);
if (ret)
return ret;
if (attr.ndev)
dev_put(attr.ndev);
return attr->index;
spin_lock_irqsave(&iboe->lock, flags);
port_gid_table = &iboe->gids[port_num - 1];
for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)
if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) &&
attr.gid_type == port_gid_table->gids[i].gid_type) {
if (!memcmp(&port_gid_table->gids[i].gid,
&attr->gid, sizeof(attr->gid)) &&
attr->gid_type == port_gid_table->gids[i].gid_type) {
ctx = port_gid_table->gids[i].ctx;
break;
}
@ -525,8 +517,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->page_size_cap = dev->dev->caps.page_size_cap;
props->max_qp = dev->dev->quotas.qp;
props->max_qp_wr = dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE;
props->max_sge = min(dev->dev->caps.max_sq_sg,
dev->dev->caps.max_rq_sg);
props->max_send_sge = dev->dev->caps.max_sq_sg;
props->max_recv_sge = dev->dev->caps.max_rq_sg;
props->max_sge_rd = MLX4_MAX_SGE_RD;
props->max_cq = dev->dev->quotas.cq;
props->max_cqe = dev->dev->caps.max_cqes;
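
Reviewer note: the hunk above replaces the single max_sge attribute, which had to be the minimum of the send and receive limits, with separate max_send_sge and max_recv_sge values. The sketch below illustrates what a consumer gains on hardware where the two differ. The struct and function names are hypothetical and do not correspond to the ib_device_attr layout.

```c
#include <assert.h>

/* Hypothetical capability and attribute structs, for illustration only. */
struct dev_caps {
	int max_sq_sg; /* scatter/gather entries per send WQE */
	int max_rq_sg; /* scatter/gather entries per receive WQE */
};

struct dev_attr {
	int max_send_sge;
	int max_recv_sge;
};

/* Old behaviour: one value, so the smaller limit applies to both queues. */
int legacy_max_sge(const struct dev_caps *caps)
{
	assert(caps);
	return caps->max_sq_sg < caps->max_rq_sg ? caps->max_sq_sg
						 : caps->max_rq_sg;
}

/* New behaviour: each queue reports its own limit. */
void fill_sge_limits(const struct dev_caps *caps, struct dev_attr *attr)
{
	assert(caps && attr);
	attr->max_send_sge = caps->max_sq_sg;
	attr->max_recv_sge = caps->max_rq_sg;
}
```

With caps of 32 send and 16 receive entries, the legacy value caps both queues at 16, whereas the split attributes let a sender use all 32.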
@ -770,7 +762,8 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
IB_WIDTH_4X : IB_WIDTH_1X;
props->active_speed = (((u8 *)mailbox->buf)[5] == 0x20 /*56Gb*/) ?
IB_SPEED_FDR : IB_SPEED_QDR;
props->port_cap_flags = IB_PORT_CM_SUP | IB_PORT_IP_BASED_GIDS;
props->port_cap_flags = IB_PORT_CM_SUP;
props->ip_gids = true;
props->gid_tbl_len = mdev->dev->caps.gid_table_len[port];
props->max_msg_sz = mdev->dev->caps.max_msg_sz;
props->pkey_tbl_len = 1;
@ -2709,6 +2702,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.modify_qp = mlx4_ib_modify_qp;
ibdev->ib_dev.query_qp = mlx4_ib_query_qp;
ibdev->ib_dev.destroy_qp = mlx4_ib_destroy_qp;
ibdev->ib_dev.drain_sq = mlx4_ib_drain_sq;
ibdev->ib_dev.drain_rq = mlx4_ib_drain_rq;
ibdev->ib_dev.post_send = mlx4_ib_post_send;
ibdev->ib_dev.post_recv = mlx4_ib_post_recv;
ibdev->ib_dev.create_cq = mlx4_ib_create_cq;

@ -322,7 +322,6 @@ struct mlx4_ib_qp {
u32 doorbell_qpn;
__be32 sq_signal_bits;
unsigned sq_next_wqe;
int sq_max_wqes_per_wr;
int sq_spare_wqes;
struct mlx4_ib_wq sq;
@ -760,6 +759,10 @@ void mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq);
struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr,
struct ib_udata *udata);
struct ib_ah *mlx4_ib_create_ah_slave(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
int slave_sgid_index, u8 *s_mac,
u16 vlan_tag);
int mlx4_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
int mlx4_ib_destroy_ah(struct ib_ah *ah);
@ -771,21 +774,23 @@ int mlx4_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
int mlx4_ib_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
int mlx4_ib_destroy_srq(struct ib_srq *srq);
void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index);
int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr);
int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr);
struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata);
int mlx4_ib_destroy_qp(struct ib_qp *qp);
void mlx4_ib_drain_sq(struct ib_qp *qp);
void mlx4_ib_drain_rq(struct ib_qp *qp);
int mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata);
int mlx4_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr);
int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr);
int mlx4_ib_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr);
int mlx4_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr);
int mlx4_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr);
int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int mad_ifc_flags,
int port, const struct ib_wc *in_wc, const struct ib_grh *in_grh,
@ -900,7 +905,7 @@ int mlx4_ib_rereg_user_mr(struct ib_mr *mr, int flags,
int mr_access_flags, struct ib_pd *pd,
struct ib_udata *udata);
int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
u8 port_num, int index);
const struct ib_gid_attr *attr);
void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
int port);

@ -204,89 +204,24 @@ static void *get_send_wqe(struct mlx4_ib_qp *qp, int n)
/*
* Stamp a SQ WQE so that it is invalid if prefetched by marking the
* first four bytes of every 64 byte chunk with
* 0x7FFFFFF | (invalid_ownership_value << 31).
*
* When the max work request size is less than or equal to the WQE
* basic block size, as an optimization, we can stamp all WQEs with
* 0xffffffff, and skip the very first chunk of each WQE.
* first four bytes of every 64 byte chunk with 0xffffffff, except for
* the very first chunk of the WQE.
*/
static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n, int size)
static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n)
{
__be32 *wqe;
int i;
int s;
int ind;
void *buf;
__be32 stamp;
struct mlx4_wqe_ctrl_seg *ctrl;
if (qp->sq_max_wqes_per_wr > 1) {
s = roundup(size, 1U << qp->sq.wqe_shift);
for (i = 0; i < s; i += 64) {
ind = (i >> qp->sq.wqe_shift) + n;
stamp = ind & qp->sq.wqe_cnt ? cpu_to_be32(0x7fffffff) :
cpu_to_be32(0xffffffff);
buf = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
wqe = buf + (i & ((1 << qp->sq.wqe_shift) - 1));
*wqe = stamp;
}
} else {
ctrl = buf = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1));
buf = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1));
ctrl = (struct mlx4_wqe_ctrl_seg *)buf;
s = (ctrl->qpn_vlan.fence_size & 0x3f) << 4;
for (i = 64; i < s; i += 64) {
wqe = buf + i;
*wqe = cpu_to_be32(0xffffffff);
}
}
}
static void post_nop_wqe(struct mlx4_ib_qp *qp, int n, int size)
{
struct mlx4_wqe_ctrl_seg *ctrl;
struct mlx4_wqe_inline_seg *inl;
void *wqe;
int s;
ctrl = wqe = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1));
s = sizeof(struct mlx4_wqe_ctrl_seg);
if (qp->ibqp.qp_type == IB_QPT_UD) {
struct mlx4_wqe_datagram_seg *dgram = wqe + sizeof *ctrl;
struct mlx4_av *av = (struct mlx4_av *)dgram->av;
memset(dgram, 0, sizeof *dgram);
av->port_pd = cpu_to_be32((qp->port << 24) | to_mpd(qp->ibqp.pd)->pdn);
s += sizeof(struct mlx4_wqe_datagram_seg);
}
/* Pad the remainder of the WQE with an inline data segment. */
if (size > s) {
inl = wqe + s;
inl->byte_count = cpu_to_be32(1 << 31 | (size - s - sizeof *inl));
}
ctrl->srcrb_flags = 0;
ctrl->qpn_vlan.fence_size = size / 16;
/*
* Make sure descriptor is fully written before setting ownership bit
* (because HW can start executing as soon as we do).
*/
wmb();
ctrl->owner_opcode = cpu_to_be32(MLX4_OPCODE_NOP | MLX4_WQE_CTRL_NEC) |
(n & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0);
stamp_send_wqe(qp, n + qp->sq_spare_wqes, size);
}
/* Post NOP WQE to prevent wrap-around in the middle of WR */
static inline unsigned pad_wraparound(struct mlx4_ib_qp *qp, int ind)
{
unsigned s = qp->sq.wqe_cnt - (ind & (qp->sq.wqe_cnt - 1));
if (unlikely(s < qp->sq_max_wqes_per_wr)) {
post_nop_wqe(qp, ind, s << qp->sq.wqe_shift);
ind += s;
}
return ind;
}
static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type)
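
Reviewer note: with WQE shrinking removed, stamp_send_wqe() keeps only the simple case: read the WQE size from the low six bits of fence_size (in 16-byte units) and write 0xffffffff to the first four bytes of every 64-byte chunk except the first. The following user-space sketch shows that loop on a plain byte buffer. The helper names are hypothetical and the buffer is not a real mlx4 WQE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 64 /* stamping granularity in bytes */

/* Size in bytes encoded in the low 6 bits of fence_size (16-byte units). */
size_t wqe_bytes_from_fence_size(uint8_t fence_size)
{
	return (size_t)(fence_size & 0x3f) << 4;
}

/*
 * Stamp every 64-byte chunk of the WQE except the first one, so that a
 * prefetched stale WQE is seen as invalid.
 */
void stamp_wqe(uint8_t *wqe, size_t wqe_bytes)
{
	static const uint8_t stamp[4] = { 0xff, 0xff, 0xff, 0xff };
	size_t off;

	assert(wqe);
	for (off = CHUNK; off < wqe_bytes; off += CHUNK)
		memcpy(wqe + off, stamp, sizeof(stamp));
}
```

The first chunk is skipped because it holds the control segment, whose ownership bit already tells the hardware whether the WQE is valid.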
@ -433,8 +368,7 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
}
static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
enum mlx4_ib_qp_type type, struct mlx4_ib_qp *qp,
bool shrink_wqe)
enum mlx4_ib_qp_type type, struct mlx4_ib_qp *qp)
{
int s;
@ -461,69 +395,19 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
if (s > dev->dev->caps.max_sq_desc_sz)
return -EINVAL;
/*
* Hermon supports shrinking WQEs, such that a single work
* request can include multiple units of 1 << wqe_shift. This
* way, work requests can differ in size, and do not have to
* be a power of 2 in size, saving memory and speeding up send
* WR posting. Unfortunately, if we do this then the
* wqe_index field in CQEs can't be used to look up the WR ID
* anymore, so we do this only if selective signaling is off.
*
* Further, on 32-bit platforms, we can't use vmap() to make
* the QP buffer virtually contiguous. Thus we have to use
* constant-sized WRs to make sure a WR is always fully within
* a single page-sized chunk.
*
* Finally, we use NOP work requests to pad the end of the
* work queue, to avoid wrap-around in the middle of WR. We
* set NEC bit to avoid getting completions with error for
* these NOP WRs, but since NEC is only supported starting
* with firmware 2.2.232, we use constant-sized WRs for older
* firmware.
*
* And, since MLX QPs only support SEND, we use constant-sized
* WRs in this case.
*
* We look for the smallest value of wqe_shift such that the
* resulting number of wqes does not exceed device
* capabilities.
*
* We set WQE size to at least 64 bytes, this way stamping
* invalidates each WQE.
*/
if (shrink_wqe && dev->dev->caps.fw_ver >= MLX4_FW_VER_WQE_CTRL_NEC &&
qp->sq_signal_bits && BITS_PER_LONG == 64 &&
type != MLX4_IB_QPT_SMI && type != MLX4_IB_QPT_GSI &&
!(type & (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_PROXY_SMI |
MLX4_IB_QPT_PROXY_GSI | MLX4_IB_QPT_TUN_SMI_OWNER)))
qp->sq.wqe_shift = ilog2(64);
else
qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
for (;;) {
qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1U << qp->sq.wqe_shift);
/*
* We need to leave 2 KB + 1 WR of headroom in the SQ to
* allow HW to prefetch.
*/
qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + qp->sq_max_wqes_per_wr;
qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr *
qp->sq_max_wqes_per_wr +
qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + 1;
qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr +
qp->sq_spare_wqes);
if (qp->sq.wqe_cnt <= dev->dev->caps.max_wqes)
break;
if (qp->sq_max_wqes_per_wr <= 1)
return -EINVAL;
++qp->sq.wqe_shift;
}
qp->sq.max_gs = (min(dev->dev->caps.max_sq_desc_sz,
(qp->sq_max_wqes_per_wr << qp->sq.wqe_shift)) -
qp->sq.max_gs =
(min(dev->dev->caps.max_sq_desc_sz,
(1 << qp->sq.wqe_shift)) -
send_wqe_overhead(type, qp->flags)) /
sizeof (struct mlx4_wqe_data_seg);
@ -538,7 +422,7 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
}
cap->max_send_wr = qp->sq.max_post =
(qp->sq.wqe_cnt - qp->sq_spare_wqes) / qp->sq_max_wqes_per_wr;
qp->sq.wqe_cnt - qp->sq_spare_wqes;
cap->max_send_sge = min(qp->sq.max_gs,
min(dev->dev->caps.max_sq_sg,
dev->dev->caps.max_rq_sg));
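
Reviewer note: after this change the send queue sizing in set_kernel_sq_size() reduces to three steps: round the descriptor size up to a power of two, reserve 2 KB plus one WQE of headroom for hardware prefetch, and round the total up to a power of two. The sketch below reproduces that arithmetic in user space under the assumption that every work request occupies exactly one WQE. The struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= v (v must be non-zero). */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

/* Integer log2 of a power of two. */
static unsigned int log2_pow2(uint32_t v)
{
	unsigned int shift = 0;

	while ((1u << shift) < v)
		shift++;
	return shift;
}

/* Hypothetical result struct; not the mlx4_ib_qp layout. */
struct sq_layout {
	unsigned int wqe_shift; /* log2 of the WQE stride in bytes */
	uint32_t spare_wqes;    /* headroom left for HW prefetch */
	uint32_t wqe_cnt;       /* total WQEs in the queue */
	uint32_t max_post;      /* WQEs the caller may actually post */
};

void size_send_queue(uint32_t desc_size, uint32_t max_send_wr,
		     struct sq_layout *out)
{
	assert(desc_size > 0 && out);

	out->wqe_shift = log2_pow2(roundup_pow2(desc_size));
	/* 2 KB of headroom plus one work request. */
	out->spare_wqes = (2048u >> out->wqe_shift) + 1;
	out->wqe_cnt = roundup_pow2(max_send_wr + out->spare_wqes);
	out->max_post = out->wqe_cnt - out->spare_wqes;
}
```

For a 48-byte descriptor and 100 requested work requests this gives a 64-byte stride, 33 spare WQEs, 256 WQEs in total and 223 postable entries.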
@ -977,7 +861,6 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
{
int qpn;
int err;
struct ib_qp_cap backup_cap;
struct mlx4_ib_sqp *sqp = NULL;
struct mlx4_ib_qp *qp;
enum mlx4_ib_qp_type qp_type = (enum mlx4_ib_qp_type) init_attr->qp_type;
@ -1178,9 +1061,7 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
goto err;
}
memcpy(&backup_cap, &init_attr->cap, sizeof(backup_cap));
err = set_kernel_sq_size(dev, &init_attr->cap,
qp_type, qp, true);
err = set_kernel_sq_size(dev, &init_attr->cap, qp_type, qp);
if (err)
goto err;
@ -1192,21 +1073,11 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
*qp->db.db = 0;
}
if (mlx4_buf_alloc(dev->dev, qp->buf_size, qp->buf_size,
if (mlx4_buf_alloc(dev->dev, qp->buf_size, PAGE_SIZE * 2,
&qp->buf)) {
memcpy(&init_attr->cap, &backup_cap,
sizeof(backup_cap));
err = set_kernel_sq_size(dev, &init_attr->cap, qp_type,
qp, false);
if (err)
goto err_db;
if (mlx4_buf_alloc(dev->dev, qp->buf_size,
PAGE_SIZE * 2, &qp->buf)) {
err = -ENOMEM;
goto err_db;
}
}
err = mlx4_mtt_init(dev->dev, qp->buf.npages, qp->buf.page_shift,
&qp->mtt);
@ -1859,8 +1730,7 @@ static int _mlx4_set_path(struct mlx4_ib_dev *dev,
if (rdma_ah_get_ah_flags(ah) & IB_AH_GRH) {
const struct ib_global_route *grh = rdma_ah_read_grh(ah);
int real_sgid_index =
mlx4_ib_gid_index_to_real_index(dev, port,
grh->sgid_index);
mlx4_ib_gid_index_to_real_index(dev, grh->sgid_attr);
if (real_sgid_index < 0)
return real_sgid_index;
@ -2176,6 +2046,7 @@ static int __mlx4_ib_modify_qp(void *src, enum mlx4_ib_source_type src_type,
{
struct ib_uobject *ibuobject;
struct ib_srq *ibsrq;
const struct ib_gid_attr *gid_attr = NULL;
struct ib_rwq_ind_table *rwq_ind_tbl;
enum ib_qp_type qp_type;
struct mlx4_ib_dev *dev;
@ -2356,29 +2227,17 @@ static int __mlx4_ib_modify_qp(void *src, enum mlx4_ib_source_type src_type,
if (attr_mask & IB_QP_AV) {
u8 port_num = mlx4_is_bonded(dev->dev) ? 1 :
attr_mask & IB_QP_PORT ? attr->port_num : qp->port;
union ib_gid gid;
struct ib_gid_attr gid_attr = {.gid_type = IB_GID_TYPE_IB};
u16 vlan = 0xffff;
u8 smac[ETH_ALEN];
int status = 0;
int is_eth =
rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
rdma_ah_get_ah_flags(&attr->ah_attr) & IB_AH_GRH;
if (is_eth) {
int index =
rdma_ah_read_grh(&attr->ah_attr)->sgid_index;
status = ib_get_cached_gid(&dev->ib_dev, port_num,
index, &gid, &gid_attr);
if (!status) {
vlan = rdma_vlan_dev_vlan_id(gid_attr.ndev);
memcpy(smac, gid_attr.ndev->dev_addr, ETH_ALEN);
dev_put(gid_attr.ndev);
gid_attr = attr->ah_attr.grh.sgid_attr;
vlan = rdma_vlan_dev_vlan_id(gid_attr->ndev);
memcpy(smac, gid_attr->ndev->dev_addr, ETH_ALEN);
}
}
if (status)
goto out;
if (mlx4_set_path(dev, attr, attr_mask, qp, &context->pri_path,
port_num, vlan, smac))
@ -2389,7 +2248,7 @@ static int __mlx4_ib_modify_qp(void *src, enum mlx4_ib_source_type src_type,
if (is_eth &&
(cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR)) {
u8 qpc_roce_mode = gid_type_to_qpc(gid_attr.gid_type);
u8 qpc_roce_mode = gid_type_to_qpc(gid_attr->gid_type);
if (qpc_roce_mode == MLX4_QPC_ROCE_MODE_UNDEFINED) {
err = -EINVAL;
@ -2594,11 +2453,9 @@ static int __mlx4_ib_modify_qp(void *src, enum mlx4_ib_source_type src_type,
for (i = 0; i < qp->sq.wqe_cnt; ++i) {
ctrl = get_send_wqe(qp, i);
ctrl->owner_opcode = cpu_to_be32(1 << 31);
if (qp->sq_max_wqes_per_wr == 1)
ctrl->qpn_vlan.fence_size =
1 << (qp->sq.wqe_shift - 4);
stamp_send_wqe(qp, i, 1 << qp->sq.wqe_shift);
stamp_send_wqe(qp, i);
}
}
@ -2937,7 +2794,7 @@ static int vf_get_qp0_qkey(struct mlx4_dev *dev, int qpn, u32 *qkey)
}
static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
struct ib_ud_wr *wr,
const struct ib_ud_wr *wr,
void *wqe, unsigned *mlx_seg_len)
{
struct mlx4_ib_dev *mdev = to_mdev(sqp->qp.ibqp.device);
@ -3085,7 +2942,7 @@ static int fill_gid_by_hw_index(struct mlx4_ib_dev *ibdev, u8 port_num,
}
#define MLX4_ROCEV2_QP1_SPORT 0xC000
static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
static int build_mlx_header(struct mlx4_ib_sqp *sqp, const struct ib_ud_wr *wr,
void *wqe, unsigned *mlx_seg_len)
{
struct ib_device *ib_dev = sqp->qp.ibqp.device;
@ -3181,10 +3038,8 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
to_mdev(ib_dev)->sriov.demux[sqp->qp.port - 1].
guid_cache[ah->av.ib.gid_index];
} else {
ib_get_cached_gid(ib_dev,
be32_to_cpu(ah->av.ib.port_pd) >> 24,
ah->av.ib.gid_index,
&sqp->ud_header.grh.source_gid, NULL);
sqp->ud_header.grh.source_gid =
ah->ibah.sgid_attr->gid;
}
}
memcpy(sqp->ud_header.grh.destination_gid.raw,
@ -3369,7 +3224,7 @@ static __be32 convert_access(int acc)
}
static void set_reg_seg(struct mlx4_wqe_fmr_seg *fseg,
struct ib_reg_wr *wr)
const struct ib_reg_wr *wr)
{
struct mlx4_ib_mr *mr = to_mmr(wr->mr);
@ -3399,7 +3254,7 @@ static __always_inline void set_raddr_seg(struct mlx4_wqe_raddr_seg *rseg,
}
static void set_atomic_seg(struct mlx4_wqe_atomic_seg *aseg,
struct ib_atomic_wr *wr)
const struct ib_atomic_wr *wr)
{
if (wr->wr.opcode == IB_WR_ATOMIC_CMP_AND_SWP) {
aseg->swap_add = cpu_to_be64(wr->swap);
@ -3415,7 +3270,7 @@ static void set_atomic_seg(struct mlx4_wqe_atomic_seg *aseg,
}
static void set_masked_atomic_seg(struct mlx4_wqe_masked_atomic_seg *aseg,
struct ib_atomic_wr *wr)
const struct ib_atomic_wr *wr)
{
aseg->swap_add = cpu_to_be64(wr->swap);
aseg->swap_add_mask = cpu_to_be64(wr->swap_mask);
@ -3424,7 +3279,7 @@ static void set_masked_atomic_seg(struct mlx4_wqe_masked_atomic_seg *aseg,
}
static void set_datagram_seg(struct mlx4_wqe_datagram_seg *dseg,
struct ib_ud_wr *wr)
const struct ib_ud_wr *wr)
{
memcpy(dseg->av, &to_mah(wr->ah)->av, sizeof (struct mlx4_av));
dseg->dqpn = cpu_to_be32(wr->remote_qpn);
@ -3435,7 +3290,7 @@ static void set_datagram_seg(struct mlx4_wqe_datagram_seg *dseg,
static void set_tunnel_datagram_seg(struct mlx4_ib_dev *dev,
struct mlx4_wqe_datagram_seg *dseg,
struct ib_ud_wr *wr,
const struct ib_ud_wr *wr,
enum mlx4_ib_qp_type qpt)
{
union mlx4_ext_av *av = &to_mah(wr->ah)->av;
@ -3457,7 +3312,8 @@ static void set_tunnel_datagram_seg(struct mlx4_ib_dev *dev,
dseg->qkey = cpu_to_be32(IB_QP_SET_QKEY);
}
static void build_tunnel_header(struct ib_ud_wr *wr, void *wqe, unsigned *mlx_seg_len)
static void build_tunnel_header(const struct ib_ud_wr *wr, void *wqe,
unsigned *mlx_seg_len)
{
struct mlx4_wqe_inline_seg *inl = wqe;
struct mlx4_ib_tunnel_header hdr;
@ -3540,9 +3396,9 @@ static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ib_sge *sg)
dseg->addr = cpu_to_be64(sg->addr);
}
static int build_lso_seg(struct mlx4_wqe_lso_seg *wqe, struct ib_ud_wr *wr,
struct mlx4_ib_qp *qp, unsigned *lso_seg_len,
__be32 *lso_hdr_sz, __be32 *blh)
static int build_lso_seg(struct mlx4_wqe_lso_seg *wqe,
const struct ib_ud_wr *wr, struct mlx4_ib_qp *qp,
unsigned *lso_seg_len, __be32 *lso_hdr_sz, __be32 *blh)
{
unsigned halign = ALIGN(sizeof *wqe + wr->hlen, 16);
@ -3560,7 +3416,7 @@ static int build_lso_seg(struct mlx4_wqe_lso_seg *wqe, struct ib_ud_wr *wr,
return 0;
}
static __be32 send_ieth(struct ib_send_wr *wr)
static __be32 send_ieth(const struct ib_send_wr *wr)
{
switch (wr->opcode) {
case IB_WR_SEND_WITH_IMM:
@ -3582,8 +3438,8 @@ static void add_zero_len_inline(void *wqe)
inl->byte_count = cpu_to_be32(1 << 31);
}
int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr)
static int _mlx4_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr, bool drain)
{
struct mlx4_ib_qp *qp = to_mqp(ibqp);
void *wqe;
@ -3593,7 +3449,6 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
int nreq;
int err = 0;
unsigned ind;
int uninitialized_var(stamp);
int uninitialized_var(size);
unsigned uninitialized_var(seglen);
__be32 dummy;
@ -3623,7 +3478,8 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
}
spin_lock_irqsave(&qp->sq.lock, flags);
if (mdev->dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
if (mdev->dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR &&
!drain) {
err = -EIO;
*bad_wr = wr;
nreq = 0;
@ -3865,22 +3721,14 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
ctrl->owner_opcode = mlx4_ib_opcode[wr->opcode] |
(ind & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0) | blh;
stamp = ind + qp->sq_spare_wqes;
ind += DIV_ROUND_UP(size * 16, 1U << qp->sq.wqe_shift);
/*
* We can improve latency by not stamping the last
* send queue WQE until after ringing the doorbell, so
* only stamp here if there are still more WQEs to post.
*
* Same optimization applies to padding with NOP wqe
* in case of WQE shrinking (used to prevent wrap-around
* in the middle of WR).
*/
if (wr->next) {
stamp_send_wqe(qp, stamp, size * 16);
ind = pad_wraparound(qp, ind);
}
if (wr->next)
stamp_send_wqe(qp, ind + qp->sq_spare_wqes);
ind++;
}
out:
@ -3902,9 +3750,8 @@ out:
*/
mmiowb();
stamp_send_wqe(qp, stamp, size * 16);
stamp_send_wqe(qp, ind + qp->sq_spare_wqes - 1);
ind = pad_wraparound(qp, ind);
qp->sq_next_wqe = ind;
}
@ -3913,8 +3760,14 @@ out:
return err;
}
int mlx4_ib_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
int mlx4_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
const struct ib_send_wr **bad_wr)
{
return _mlx4_ib_post_send(ibqp, wr, bad_wr, false);
}
static int _mlx4_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr, bool drain)
{
struct mlx4_ib_qp *qp = to_mqp(ibqp);
struct mlx4_wqe_data_seg *scat;
@ -3929,7 +3782,8 @@ int mlx4_ib_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
max_gs = qp->rq.max_gs;
spin_lock_irqsave(&qp->rq.lock, flags);
if (mdev->dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
if (mdev->dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR &&
!drain) {
err = -EIO;
*bad_wr = wr;
nreq = 0;
@ -4000,6 +3854,12 @@ out:
return err;
}
int mlx4_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
return _mlx4_ib_post_recv(ibqp, wr, bad_wr, false);
}
static inline enum ib_qp_state to_ib_qp_state(enum mlx4_qp_state mlx4_state)
{
switch (mlx4_state) {
@ -4047,9 +3907,9 @@ static void to_rdma_ah_attr(struct mlx4_ib_dev *ibdev,
u8 port_num = path->sched_queue & 0x40 ? 2 : 1;
memset(ah_attr, 0, sizeof(*ah_attr));
ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, port_num);
if (port_num == 0 || port_num > dev->caps.num_ports)
return;
ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, port_num);
if (ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE)
rdma_ah_set_sl(ah_attr, ((path->sched_queue >> 3) & 0x7) |
@ -4465,3 +4325,132 @@ int mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
kfree(ib_rwq_ind_tbl);
return 0;
}
struct mlx4_ib_drain_cqe {
struct ib_cqe cqe;
struct completion done;
};
static void mlx4_ib_drain_qp_done(struct ib_cq *cq, struct ib_wc *wc)
{
struct mlx4_ib_drain_cqe *cqe = container_of(wc->wr_cqe,
struct mlx4_ib_drain_cqe,
cqe);
complete(&cqe->done);
}
/* This function returns only once the drained WR was completed */
static void handle_drain_completion(struct ib_cq *cq,
struct mlx4_ib_drain_cqe *sdrain,
struct mlx4_ib_dev *dev)
{
struct mlx4_dev *mdev = dev->dev;
if (cq->poll_ctx == IB_POLL_DIRECT) {
while (wait_for_completion_timeout(&sdrain->done, HZ / 10) <= 0)
ib_process_cq_direct(cq, -1);
return;
}
if (mdev->persist->state == MLX4_DEVICE_STATE_INTERNAL_ERROR) {
struct mlx4_ib_cq *mcq = to_mcq(cq);
bool triggered = false;
unsigned long flags;
spin_lock_irqsave(&dev->reset_flow_resource_lock, flags);
/* Make sure that the CQ handler won't run if wasn't run yet */
if (!mcq->mcq.reset_notify_added)
mcq->mcq.reset_notify_added = 1;
else
triggered = true;
spin_unlock_irqrestore(&dev->reset_flow_resource_lock, flags);
if (triggered) {
/* Wait for any scheduled/running task to be ended */
switch (cq->poll_ctx) {
case IB_POLL_SOFTIRQ:
irq_poll_disable(&cq->iop);
irq_poll_enable(&cq->iop);
break;
case IB_POLL_WORKQUEUE:
cancel_work_sync(&cq->work);
break;
default:
WARN_ON_ONCE(1);
}
}
/* Run the CQ handler - this makes sure that the drain WR will
* be processed if wasn't processed yet.
*/
mcq->mcq.comp(&mcq->mcq);
}
wait_for_completion(&sdrain->done);
}
void mlx4_ib_drain_sq(struct ib_qp *qp)
{
struct ib_cq *cq = qp->send_cq;
struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
struct mlx4_ib_drain_cqe sdrain;
const struct ib_send_wr *bad_swr;
struct ib_rdma_wr swr = {
.wr = {
.next = NULL,
{ .wr_cqe = &sdrain.cqe, },
.opcode = IB_WR_RDMA_WRITE,
},
};
int ret;
struct mlx4_ib_dev *dev = to_mdev(qp->device);
struct mlx4_dev *mdev = dev->dev;
ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
if (ret && mdev->persist->state != MLX4_DEVICE_STATE_INTERNAL_ERROR) {
WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
return;
}
sdrain.cqe.done = mlx4_ib_drain_qp_done;
init_completion(&sdrain.done);
ret = _mlx4_ib_post_send(qp, &swr.wr, &bad_swr, true);
if (ret) {
WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
return;
}
handle_drain_completion(cq, &sdrain, dev);
}
void mlx4_ib_drain_rq(struct ib_qp *qp)
{
struct ib_cq *cq = qp->recv_cq;
struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
struct mlx4_ib_drain_cqe rdrain;
struct ib_recv_wr rwr = {};
const struct ib_recv_wr *bad_rwr;
int ret;
struct mlx4_ib_dev *dev = to_mdev(qp->device);
struct mlx4_dev *mdev = dev->dev;
ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
if (ret && mdev->persist->state != MLX4_DEVICE_STATE_INTERNAL_ERROR) {
WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret);
return;
}
rwr.wr_cqe = &rdrain.cqe;
rdrain.cqe.done = mlx4_ib_drain_qp_done;
init_completion(&rdrain.done);
ret = _mlx4_ib_post_recv(qp, &rwr, &bad_rwr, true);
if (ret) {
WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret);
return;
}
handle_drain_completion(cq, &rdrain, dev);
}

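The post_send/post_recv hunks above split each entry point into an internal function that takes a `drain` flag and a thin public wrapper that passes `false`. The drain helpers pass `true`, so the marker work request can still be queued after the device has entered the internal-error state. A minimal userspace sketch of that shape follows. Every name in it (`model_qp`, `model_post_internal`, `model_post`, `model_drain_post`) is hypothetical, and it models only the error-state check, not the WQE building or the completion wait.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue pair plus the device state it checks. */
struct model_qp {
	bool device_in_error;	/* models MLX4_DEVICE_STATE_INTERNAL_ERROR */
	int posted;		/* number of work requests accepted so far */
};

/* Internal entry point: the drain flag bypasses the error-state rejection. */
static int model_post_internal(struct model_qp *qp, bool drain)
{
	assert(qp != NULL);
	if (qp->device_in_error && !drain)
		return -EIO;
	qp->posted++;
	return 0;
}

/* Public wrapper: ordinary callers never bypass the check. */
static int model_post(struct model_qp *qp)
{
	return model_post_internal(qp, false);
}

/* Drain path: posts the marker request even when the device is in error. */
static int model_drain_post(struct model_qp *qp)
{
	return model_post_internal(qp, true);
}
```

Keeping the flag out of the public signature means existing callers of the post functions are unaffected, and only the drain code in the same file can request the bypass.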

@ -307,8 +307,8 @@ void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index)
spin_unlock(&srq->lock);
}
int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
struct mlx4_ib_srq *srq = to_msrq(ibsrq);
struct mlx4_wqe_srq_next_seg *next;


@ -3,3 +3,5 @@ obj-$(CONFIG_MLX5_INFINIBAND) += mlx5_ib.o
mlx5_ib-y := main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o gsi.o ib_virt.o cmd.o cong.o
mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
mlx5_ib-$(CONFIG_MLX5_ESWITCH) += ib_rep.o
mlx5_ib-$(CONFIG_INFINIBAND_USER_ACCESS) += devx.o
mlx5_ib-$(CONFIG_INFINIBAND_USER_ACCESS) += flow.o


@ -37,7 +37,6 @@ static struct ib_ah *create_ib_ah(struct mlx5_ib_dev *dev,
struct rdma_ah_attr *ah_attr)
{
enum ib_gid_type gid_type;
int err;
if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
@ -53,18 +52,12 @@ static struct ib_ah *create_ib_ah(struct mlx5_ib_dev *dev,
ah->av.stat_rate_sl = (rdma_ah_get_static_rate(ah_attr) << 4);
if (ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) {
err = mlx5_get_roce_gid_type(dev, ah_attr->port_num,
ah_attr->grh.sgid_index,
&gid_type);
if (err)
return ERR_PTR(err);
gid_type = ah_attr->grh.sgid_attr->gid_type;
memcpy(ah->av.rmac, ah_attr->roce.dmac,
sizeof(ah_attr->roce.dmac));
ah->av.udp_sport =
mlx5_get_roce_udp_sport(dev,
rdma_ah_get_port_num(ah_attr),
rdma_ah_read_grh(ah_attr)->sgid_index);
mlx5_get_roce_udp_sport(dev, ah_attr->grh.sgid_attr);
ah->av.stat_rate_sl |= (rdma_ah_get_sl(ah_attr) & 0x7) << 1;
if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP)
#define MLX5_ECN_ENABLED BIT(1)


@ -185,3 +185,15 @@ int mlx5_cmd_dealloc_memic(struct mlx5_memic *memic, u64 addr, u64 length)
return err;
}
int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out)
{
u32 in[MLX5_ST_SZ_DW(ppcnt_reg)] = {};
int sz = MLX5_ST_SZ_BYTES(ppcnt_reg);
MLX5_SET(ppcnt_reg, in, local_port, 1);
MLX5_SET(ppcnt_reg, in, grp, MLX5_ETHERNET_EXTENDED_COUNTERS_GROUP);
return mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PPCNT,
0, 0);
}


@ -41,6 +41,7 @@ int mlx5_cmd_dump_fill_mkey(struct mlx5_core_dev *dev, u32 *mkey);
int mlx5_cmd_null_mkey(struct mlx5_core_dev *dev, u32 *null_mkey);
int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
void *out, int out_size);
int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out);
int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *mdev,
void *in, int in_size);
int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr,


@ -359,9 +359,6 @@ static ssize_t get_param(struct file *filp, char __user *buf, size_t count,
int ret;
char lbuf[11];
if (*pos)
return 0;
ret = mlx5_ib_get_cc_params(param->dev, param->port_num, offset, &var);
if (ret)
return ret;
@ -370,11 +367,7 @@ static ssize_t get_param(struct file *filp, char __user *buf, size_t count,
if (ret < 0)
return ret;
if (copy_to_user(buf, lbuf, ret))
return -EFAULT;
*pos += ret;
return ret;
return simple_read_from_buffer(buf, count, pos, lbuf, ret);
}
static const struct file_operations dbg_cc_fops = {

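The get_param() hunk above drops the open-coded `*pos` check and `copy_to_user()` call in favour of `simple_read_from_buffer()`. As the removed lines read, the old code copied all `ret` bytes regardless of `count`, whereas the helper clamps the copy to what the caller asked for and to what remains after `*pos`. The sketch below is a userspace model of that clamping, not the kernel implementation: `model_read_from_buffer` is a hypothetical name, and `memcpy()` stands in for `copy_to_user()`, so the fault and partial-copy paths are not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical userspace model of a positioned read from a fixed buffer.
 * Returns the number of bytes copied, 0 at end of data, or -EINVAL for a
 * negative offset. Advances *ppos by the number of bytes copied.
 */
static ssize_t model_read_from_buffer(void *to, size_t count, off_t *ppos,
				      const void *from, size_t available)
{
	off_t pos;

	assert(ppos != NULL);
	pos = *ppos;
	if (pos < 0)
		return -EINVAL;
	/* Nothing left to read, or nothing requested. */
	if ((size_t)pos >= available || count == 0)
		return 0;
	/* Clamp the request to the bytes remaining after pos. */
	if (count > available - (size_t)pos)
		count = available - (size_t)pos;
	memcpy(to, (const char *)from + pos, count);
	*ppos = pos + (off_t)count;
	return (ssize_t)count;
}
```

With a six-byte source such as `"12345\n"`, a four-byte read followed by a larger read returns 4, then 2, then 0, which is the short-read behaviour a caller like `cat` relies on.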

@ -1184,7 +1184,7 @@ int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period)
int err;
if (!MLX5_CAP_GEN(dev->mdev, cq_moderation))
return -ENOSYS;
return -EOPNOTSUPP;
if (cq_period > MLX5_MAX_CQ_PERIOD)
return -EINVAL;

File diff suppressed because it is too large.

Some files were not shown because too many files have changed in this diff.