Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ccid3_hc_tx_send_packet currently returns 0 when the time difference between
current time and t_nom is less than 1000 microseconds.
In this case the packet is sent immediately; but, unlike other packets that can
be emitted on first attempt, it will not have its window counter updated and
its options set as required. This is a bug.
Fix: Require the time difference to be at least 1000 microseconds. The
algorithm then converges: time differences > 1000 microseconds trigger the
timer in dccp_write_xmit; after timer expiry this function is tried again; when
the time difference is less than 1000, the packet will have its options added
and window counter updated as required.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
This updates the computation of t_nom and t_last_win_count to use the newer
gettimeofday interface.
Committer note: used ktime_to_timeval to set the 'now' variable to t_ld in
ccid3hctx_no_feedback_timer
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
It had just a slab cache, so, for the sake of simplicity just make
dccp_trfc_lib module init routine create the slab cache, no need for users of
the lib to create a private loss_interval object.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now ccid3_hc_rx_update_li is ready to be moved to
net/dccp/ccids/lib/loss_interval, it uses the same interface as the other
functions there.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparatory patch for moving these loss interval functions from
net/dccp/ccids/ccid3.c to net/dccp/ccids/lib/loss_interval.c.
Based on a patch by Ian McDonald.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
When compiling with EXTRA_CFLAGS=-W noticed that tstamp is not initialised
correctly in dccp_li_calc_first_li.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
This fixes an error in the calculation of t_ipi when X converges towards
very low sending rates (between 1 and 64 bytes per second).
Although this case may not sound likely, it can be reproduced by connecting,
hitting enter (1 byte sent) and waiting for some time, during which the
nofeedback timer halves the sending rate until finally it reaches the region
1..64 bytes/sec. Computing X is handled correctly (tested separately); but by
dividing X _before_ entering the calculation of t_ipi, X becomes zero as
a result. This in turn triggers a BUG condition caught in scaled_div().
Fixed by replacing with equivalent statement and explicit typecast for good
measure.
Calculation verified and effect of patch tested - reduced never below 1 byte
per 64 seconds afterwards, i.e. not allowing divide-by-zero.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch follows the following recommendation made in an erratum to RFC 4342:
"Senders MAY additionally make use of other available RTT measurements,
including those from the initial Request-Response packet exchange."
It implements larger initial windows with regard to this inital RTT measurement,
using the mechanism suggested in draft-ietf-dccp-rfc3448bis, section 4.2.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This replaces the existing occurrences of RTT sampling with
the use of the new function dccp_sample_rtt.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CCID 3 and TFRC specs (RFC 4342, RFC 3448, draft-3448bis) make frequent
reference to the computation of the RFC-3390 initial sending rate:
1. Initial sending rate when RTT is known (RFC 4342, p. 6)
2. Response to Idle/Application-Limited periods (RFC 4342, 5.1)
This warrants putting the code into its own function, for later code reuse.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This clears the following sparc64 build warnings:
1) warning: format "%ld" expects type "long int", but argument 3 has type "suseconds_t"
2) warning: format "%llu" expects type "long long unsigned int", but argument 3 has type "__u64"
Fixed by using typecast to unsigned. This is argued to be safe, since the quantities, after
de-scaling (factor 2^6) fit all in u32.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This:
1. removes a race condition in the access to the scheduled send time t_nom which
results from allowing asynchronous r/w access to t_nom without locks;
2. updates the inter-packet interval t_ipi = s/X when `s' changes, following a
suggestion by Ian McDonald.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a few debugging statements to ccid3.c
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug which uses an invalid comparison.
The bug resulted in the use of invalid loss intervals.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This improves the slow-start phase by using the MSS
(as suggested in RFC 4342, sec. 5) instead of the packet size s.
Also figured out that __u32 is ample resource enough.
After applying, I got the following in the logs:
ccid3_hc_tx_packet_recv: client(f7421700), s=6, MSS=1424, w_init=4380, R_sample=176us, X=24886363
Had the previous variant been used, w_init would have been as low as 24.
Committer note: removed unneeded cast to unsigned long long that was
causing a compiler warning on 64bit architectures.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No code change at all.
This splits ccid3.c into a RX and a TX section, so that the file has an
organisation similar to the other ones (e.g. packet_history.{h,c}).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since CCID3 avoids sending 0-byte data packets (cf. ccid3_hc_tx_send_packet),
testing for zero-payload length, as performed by ccid3_hc_tx_update_s, is
redundant - hence removed by this patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts an earlier patch which disabled bidirectional mode, meaning that
a listening (passive) socket was not allowed to write to the other (active)
end of the connection.
This mode had been disabled when there were problems with CCID3, but it
imposes a constraint on socket programming and thus hinders deployment.
A change is included to ignore RX feedback received by the TX CCID3 module.
Many thanks to Andre Noll for pointing out this issue.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/dccp/ccids/ccid3.c: In function `ccid3_hc_rx_packet_recv':
net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 3)
net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 4)
opaque types must be suitably cast for printing.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a recent patch we introduced invalid return codes which will result in the
opposite of what is intended (i.e. send more packets in face of peculiar
network conditions).
This fixes it by returning ~0 which means not calculated as per
dccp_li_hist_calc_i_mean.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
That accumulated over the last months hackaton, shame on me for not
using git-apply whitespace helping hand, will do that from now on.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Spotted by David Miller when compiling on sparc64, I reproduced it here on
parisc64, that are the only platforms to define __kernel_suseconds_t as an
'int', all the others, x86_64 and x86 included typedef it as a 'long', but from
the definition of suseconds_t it should just be an 'int' on platforms where it
is >= 32bits, it would not require all the castings from suseconds_t to (int)
when printking variables of this type, that are not needed on parisc64 and
sparc64.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This fixes conversion errors which arose by not properly type-casting
from u32 to __u64. Fixed by explicitly casting each type which is not
__u64, or by performing operation after assignment.
The patch further adds missing debug information to track the current
value of X_recv.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
No code change at all.
This reorders the source file to follow the same order as the corresponding
header file.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
No code change at all.
To make the header file easier to read, the following ordering is established
among the declarations:
* hist_new
* hist_delete
* hist_entry_new
* hist_head
* hist_find_entry
* hist_add_entry
* hist_entry_delete
* hist_purge
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch does not alter any algorithm, just the debug message format:
* s#%s, sk=%p#%s(%p)#g
* when a statename is present, it now uses %s(%p, state=%s)
* when only function entry is debugged, it adds an `- entry'
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This migrates all packet history operations into the routine
ccid3_hc_tx_packet_sent, thereby removing synchronization problems
that occur when, as before, the operations are spread over multiple
routines.
The following minor simplifications are also applied:
* several simplifications now follow from this change - several tests
are now no longer required
* removal of one unnecessary variable (dp)
Justification:
Currently packet history operations span two different routines,
one of which is likely to pass through several iterations of sleeping
and awakening.
The first routine, ccid3_hc_tx_send_packet, allocates an entry and
sets a few fields. The remaining fields are filled in when the second
routine (which is not within a sleeping context), ccid3_hc_tx_packet_sent,
is called. This has several strong drawbacks:
* it is not necessary to split history operations - all fields can be
filled in by the second routine
* the first routine is called multiple times, until a packet can be sent,
and sleeps meanwhile - this causes a lot of difficulties with regard to
keeping the list consistent
* since both routines do not have a producer-consumer like synchronization,
it is very difficult to maintain data across calls to these routines
* the fact that the routines are called in different contexts (sleeping, not
sleeping) adds further problems
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This removes the `dccphtx_ccval' field since it is nowhere used in the code and
in fact not necessary for the accounting.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This puts the window counter computation [RFC 4342, 8.1] into a separate
function which is called whenever a new packet is ready for immediate
transmission in ccid3_hc_tx_send_packet.
Justification:
The window counter update was previously computed after the packet was sent. This has
two drawbacks, both fixed by this patch:
1) re-compute another timestamp almost directly after the packet was sent (expensive),
2) the CCVal for the window counter is needed at the instant the packet is sent.
Further details:
The initialisation of the window counter is left in the state NO_SENT, as before.
The algorithm will do nothing if either RTT is initialised to 0 (which is ok) or if
the RTT value remains below 4 microseconds (which is almost pathological).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
CCID3 performance depends much on the accuracy of RTT samples. If RTT
samples grow too large, performance can be catastrophically poor.
To limit the amount of possible damage in such cases, the patch
* introduces an upper limit which identifies a maximum `sane' RTT value;
* uses a macro to enforce this upper limit.
Using a macro was given preference, since it is necessary to identify the
calling function in the warning message. Since exceeding this threshold
identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN.
Many thanks to Ian McDonald for collaboration on this issue.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
In both the sender and the receiver it is possible that the stored
RTT value is accessed before an actual RTT estimate has been computed.
This patch
* initialises the sender RTT to 0
- the sender always accesses the RTT in ccid3_hc_tx_packet_sent
- the RTT is further needed for the window counter algorithm
* replaces the receiver initialisation of 5msec with 0
- which has the same effect and removes an `XXX'
- the RTT value is needed in ccid3_hc_rx_packet_recv as rtt_prev
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The function ccid3_hc_tx_insert_options only does a redundant no-op,
as the operation
DCCP_SKB_CB(skb)->dccpd_ccval = hctx->ccid3hctx_last_win_count;
is already performed _unconditionally_ in ccid3_hc_tx_send_packet.
Since there is further no current need for this function, it is removed
entirely. Since furthermore, there is actually no present need for the
entire interface function ccid_hc_tx_insert_options, it was decided to
remove it also, to clean up the interface.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is an optimisation to reduce CPU load. The received feedback is now
only directed to the active CCID component, without requiring processing
also by the inactive one.
As a consequence, a similar test in ccid3.c is now redundant and is
also removed.
Justification:
Currently DCCP works as a unidirectional service, i.e. a listening server
is not at the same time a connecting client.
As far as I can see, several modifications are necessary until that
becomes possible.
At the present time, received feedback is both fed to the rx/tx CCID
modules. In unidirectional service, only one of these is active at any
one time.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
In migrating towards using the newer functions scaled_div/scaled_div32
for TFRC computations mapped from floating-point onto integer arithmetic,
this completes the last stage of modifications.
In particular, the overflow case for computing X_calc is circumvented by
* breaking the computation into two stages
* the first stage, res = (s*1E6)/R, cannot overflow due to use of u64
* in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due
to (i) returning UINT_MAX in this case (which is logically appropriate)
and (ii) issuing a warning message into the system log (since very likely
there is a problem somewhere else with the parameters)
Lastly, all such scaling operations are now exported into tfrc.h, since
actually this form of scaled computation is specific to TFRC and not to CCID3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Problem:
Most target types in the CCID3 code are u32, so subtle conversion errors
can occur if signed time calculations yield negative results: the original
values are lost in the conversion to unsigned, calculation errors go undetected.
This patch therefore
* sets all critical time types from unsigned to suseconds_t
* avoids comparison between signed/unsigned via type-casting
* provides ample warning messages in case time calculations are negative
These warning messages can be removed at a later stage when the code
has undergone more testing.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This simplifies the calculation of a value p for a given fval when the
first loss interval is computed (RFC 3448, 6.3.1). It makes use of the
two new functions scaled_div/scaled_div32 to provide overflow protection.
Additionally, protection against divide-by-zero is extended - in this
case the function will return the maximally possible value of p=100%.
Background:
The maximum fval, f(100%), is approximately 244, i.e. the scaled value of fval
should never exceed 244E6, which fits easily into u32. The problem is the scaling
by 10^6, since additionally R(TT) is in microseconds.
This is resolved by breaking the division into two stages: the first stage
computes fval=(s*10^6)/R, stores that into u64; the second stage computes
fval = (fval*10^6)/X_recv and complains if overflow is reached for u32.
This case is safe since the TFRC reverse-lookup routine then returns p=100%.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This replaces the remaining uses of usecs_div with scaled_div32, which
internally uses 64bit division and produces a warning on overflow.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch
* resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0
* supports all sending rates from 1/64 bytes/second up to 4Gbyte/second
* simplifies the present overflow problems in calculations
Current sending rate X and the cached value X_recv of the receiver-estimated
sending rate are both scaled by 64 (2^6) in order to
* cope with low sending rates (minimally 1 byte/second)
* allow upgrading to use a packets-per-second implementation of CCID 3
* avoid calculation errors due to integer arithmetic cut-off
The patch implements a revised strategy from
http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html
The only difference with regard to that strategy is that t_ipi is already
used in the calculation of the nofeedback timeout, which saves one division.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This fixes
1) a bug in the recomputation of the sending rate by the nofeedback
timer when no feedback at all has so far been sent by the receiver:
min_t was used instead of max_t, which is wrong (cf. RFC 3448, p. 10);
2) an error in the computation of larger initial windows: instead of
min(... max()) (cf. RFC 4342, 5.), the code had used max(... max()).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>