This patch stops the dlm_recv workqueue from busy-waiting when a node
disconnects. This can cause soft lockup errors on debug systems and bad
performance generally.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Now that there can be multiple dlm_recv threads running we need to prevent two
recvs running for the same connection - it's unlikely but it can happen and it
causes message corruption.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch fixes a bug whereby data on a newly accepted connection would be
ignored if it arrived soon after the accept.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch removes some redundant fields from the connection structure and adds
some lockdep annotation to remove spurious warnings.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch converts the DLM TCP lowcomms to use workqueues rather than using its
own daemon functions. Simultaneously removing a lot of code and making it more
scalable on multi-processor machines.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch). No functional changes in this patch, just naming changes.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
I just noticed this message when testing some other changes I'd made to
lowcomms (to use workqueues) but the problem seems to be in the current
git trees too. I'm amazed no-one has seen it.
BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton.
These four should really be calls to schedule() or the dlm can busy-wait.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Remove the following unused functions:
- lowcomms_send_message()
- lowcomms_max_buffer_size()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch fixes a compile warning in lowcomms-tcp.c indicating that
kmem_cache_t is deprecated.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This fixes up most of the things pointed out by akpm and Pavel Machek
with comments below indicating why some things have been left:
Andrew Morton wrote:
>
>> +static struct nodeinfo *nodeid2nodeinfo(int nodeid, gfp_t alloc)
>> +{
>> + struct nodeinfo *ni;
>> + int r;
>> + int n;
>> +
>> + down_read(&nodeinfo_lock);
>
> Given that this function can sleep, I wonder if `alloc' is useful.
>
> I see lots of callers passing in a literal "0" for `alloc'. That's in fact
> a secret (GFP_ATOMIC & ~__GFP_HIGH). I doubt if that's what you really
> meant. Particularly as the code could at least have used __GFP_WAIT (aka
> GFP_NOIO) which is much, much more reliable than "0". In fact "0" is the
> least reliable mode possible.
>
> IOW, this is all bollixed up.
When 0 is passed into nodeid2nodeinfo the function does not try to allocate a
new structure at all. it's an indication that the caller only wants the nodeinfo
struct for that nodeid if there actually is one in existance.
I've tidied the function itself so it's more obvious, (and tidier!)
>> +/* Data received from remote end */
>> +static int receive_from_sock(void)
>> +{
>> + int ret = 0;
>> + struct msghdr msg;
>> + struct kvec iov[2];
>> + unsigned len;
>> + int r;
>> + struct sctp_sndrcvinfo *sinfo;
>> + struct cmsghdr *cmsg;
>> + struct nodeinfo *ni;
>> +
>> + /* These two are marginally too big for stack allocation, but this
>> + * function is (currently) only called by dlm_recvd so static should be
>> + * OK.
>> + */
>> + static struct sockaddr_storage msgname;
>> + static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> whoa. This is globally singly-threaded code??
Yes. it is only ever run in the context of dlm_recvd.
>>
>> +static void initiate_association(int nodeid)
>> +{
>> + struct sockaddr_storage rem_addr;
>> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> Another static buffer to worry about. Globally singly-threaded code?
Yes. Only ever called by dlm_sendd.
>> +
>> +/* Send a message */
>> +static int send_to_sock(struct nodeinfo *ni)
>> +{
>> + int ret = 0;
>> + struct writequeue_entry *e;
>> + int len, offset;
>> + struct msghdr outmsg;
>> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> Singly-threaded?
Yep.
>>
>> +static void dealloc_nodeinfo(void)
>> +{
>> + int i;
>> +
>> + for (i=1; i<=max_nodeid; i++) {
>> + struct nodeinfo *ni = nodeid2nodeinfo(i, 0);
>> + if (ni) {
>> + idr_remove(&nodeinfo_idr, i);
>
> Didn't that need locking?
Not. it's only ever called at DLM shutdown after all the other threads
have been stopped.
>>
>> +static int write_list_empty(void)
>> +{
>> + int status;
>> +
>> + spin_lock_bh(&write_nodes_lock);
>> + status = list_empty(&write_nodes);
>> + spin_unlock_bh(&write_nodes_lock);
>> +
>> + return status;
>> +}
>
> This function's return value is meaningless. As soon as the lock gets
> dropped, the return value can get out of sync with reality.
>
> Looking at the caller, this _might_ happen to be OK, but it's a nasty and
> dangerous thing. Really the locking should be moved into the caller.
It's just an optimisation to allow the caller to schedule if there is no work
to do. if something arrives immediately afterwards then it will get picked up
when the process re-awakes (and it will be woken by that arrival).
The 'accepting' atomic has gone completely. as Andrew pointed out it didn't
really achieve much anyway. I suspect it was a plaster over some other
startup or shutdown bug to be honest.
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Pavel Machek <pavel@ucw.cz>
The following patch adds a TCP based communications layer
to the DLM which is compile time selectable. The existing SCTP
layer gives the advantage of allowing multihoming, whereas
the TCP layer has been heavily tested in previous versions of
the DLM and is known to be robust and therefore can be used as
a baseline for performance testing.
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>