The source and destination addresses are included to allow channel
selection based on address alignment.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Pass a full set of flags to drivers' per-operation 'prep' routines.
Currently the only flag passed is DMA_PREP_INTERRUPT. The expectation is
that arch-specific async_tx_find_channel() implementations can exploit this
capability to find the best channel for an operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
The tx_set_src and tx_set_dest methods were originally implemented to allow
an array of addresses to be passed down from async_xor to the dmaengine
driver while minimizing stack overhead. Removing these methods allows
drivers to have all transaction parameters available at 'prep' time, saves
two function pointers in struct dma_async_tx_descriptor, and reduces the
number of indirect branches..
A consequence of moving this data to the 'prep' routine is that
multi-source routines like async_xor need temporary storage to convert an
array of linear addresses into an array of dma addresses. In order to keep
the same stack footprint of the previous implementation the input array is
reused as storage for the dma addresses. This requires that
sizeof(dma_addr_t) be less than or equal to sizeof(void *). As a
consequence CONFIG_DMADEVICES now depends on !CONFIG_HIGHMEM64G. It also
requires that drivers be able to make descriptor resources available when
the 'prep' routine is polled.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Remove the unused ASYNC_TX_ASSUME_COHERENT flag. Async_tx is
meant to hide the difference between asynchronous hardware and synchronous
software operations, this flag requires clients to understand cache
coherency consequences of the async path.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
single list_head variable initialized with LIST_HEAD_INIT could almost
always can be replaced with LIST_HEAD declaration, this shrinks the code
and looks better.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
do_async_xor must be compiled away on !HAS_DMA archs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits)
[CRYPTO] twofish: Merge common glue code
[CRYPTO] hifn_795x: Fixup container_of() usage
[CRYPTO] cast6: inline bloat--
[CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long
[CRYPTO] tcrypt: Make xcbc available as a standalone test
[CRYPTO] xcbc: Remove bogus hash/cipher test
[CRYPTO] xcbc: Fix algorithm leak when block size check fails
[CRYPTO] tcrypt: Zero axbuf in the right function
[CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
[CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h
[CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20
[CRYPTO] tcrypt: Add select of AEAD
[CRYPTO] salsa20: Add x86-64 assembly version
[CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
[CRYPTO] gcm: Introduce rfc4106
[CRYPTO] api: Show async type
[CRYPTO] chainiv: Avoid lock spinning where possible
[CRYPTO] seqiv: Add select AEAD in Kconfig
[CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy
[CRYPTO] null: Allow setkey on digest_null
...
Currently the gcm(aes) tests have to be taken together with all other
algorithms. This patch makes it available by itself at number 106.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When setting the digest size xcbc tests to see if the underlying algorithm
is a hash. This is silly because we don't allow it to be a hash and we've
specifically requested for a cipher.
This patch removes the bogus test.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the underlying algorithm has a block size other than 16 we abort
without freeing it. In fact, we try to return the algorithm itself
as an error!
This patch plugs the leak and makes it return -EINVAL instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The axbuf buffer is used by test_aead and therefore should be zeroed
there instead of in test_hash.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is the x86-64 version of the Salsa20 stream cipher algorithm. The
original assembly code came from
<http://cr.yp.to/snuffle/salsa20/amd64-3/salsa20.s>. It has been
reformatted for clarity.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch contains the salsa20-i586 implementation. The original
assembly code came from
<http://cr.yp.to/snuffle/salsa20/x86-pm/salsa20.s>. I have reformatted
it (added indents) so that it matches the other algorithms in
arch/x86/crypto.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the rfc4106 wrapper for GCM just as we have an
rfc4309 wrapper for CCM. The purpose of the wrapper is to include part
of the IV in the key so that it can be negotiated by IPsec.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes chainiv avoid spinning by postponing requests on lock
contention if the user allows the use of asynchronous algorithms. If
a synchronous algorithm is requested then we behave as before.
This should improve IPsec performance on SMP when two CPUs attempt to
transmit over the same SA. Currently one of them will spin doing nothing
waiting for the other CPU to finish its encryption. This patch makes it
postpone the request and get on with other work.
If only one CPU is transmitting for a given SA, then we will process
the request synchronously as before.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that seqiv supports AEAD algorithms it needs to select the AEAD option.
Thanks to Erez Zadok for pointing out the problem.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It's better to return silently than crash and burn when someone feeds us
a zero length. In particular the null digest algorithm when used as part
of authenc will do that to us.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We need to allow setkey on digest_null if it is to be used directly by
authenc instead of through hmac.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a null blkcipher algorithm called ecb(cipher_null) for
backwards compatibility. Previously the null algorithm when used by
IPsec copied the data byte by byte. This new algorithm optimises that
to a straight memcpy which lets us better measure inherent overheads in
our IPsec code.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds 7 test vectors to tcrypt for CCM.
The test vectors are from rfc 3610.
There are about 10 more test vectors in RFC 3610
and 4 or 5 more in NIST. I can add these as time permits.
I also needed to set authsize. CCM has a prerequisite of
authsize.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds Counter with CBC-MAC (CCM) support.
RFC 3610 and NIST Special Publication 800-38C were referenced.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes crypto_alloc_aead always return algorithms that is
capable of generating their own IVs through givencrypt and givdecrypt.
All existing AEAD algorithms already do. New ones must either supply
their own or specify a generic IV generator with the geniv field.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for using seqiv with AEAD algorithms. This is
useful for those AEAD algorithms that performs authentication before
encryption because the IV generated by the underlying encryption algorithm
won't be available for authentication.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates the infrastructure to help the construction of IV
generator templates that wrap around AEAD algorithms by adding an IV
generator to them. This is useful for AEAD algorithms with no built-in
IV generator or to replace their built-in generator.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some algorithms always require manual IV construction. For instance,
the generic CCM algorithm requires the first byte of the IV to be manually
constructed. Such algorithms are always used by other algorithms equipped
with their own IV generators and do not need IV generation per se.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements the givencrypt function for authenc. It simply
calls the givencrypt operation on the underlying cipher instead of encrypt.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the underlying givcrypt operations for aead and associated
support elements. The rationale is identical to that of the skcipher
givcrypt operations, i.e., sometimes only the algorithm knows how the
IV should be generated.
A new request type aead_givcrypt_request is added which contains an
embedded aead_request structure with two new elements to support this
operation. The new elements are seq and giv. The seq field should
contain a strictly increasing 64-bit integer which may be used by
certain IV generators as an input value. The giv field will be used
to store the generated IV. It does not need to obey the alignment
requirements of the algorithm because it's not used during the operation.
The existing iv field must still be available as it will be used to store
intermediate IVs and the output IV if chaining is desired.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This generator generates an IV based on a sequence number by xoring it
with a salt. This algorithm is mainly useful for CTR and similar modes.
This patch also sets it as the default IV generator for ctr.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the gcm algorithm over to crypto_grab_skcipher
which is a prerequisite for IV generation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the gcm_base template which takes a block cipher
parameter instead of cipher. This allows the user to specify a
specific CTR implementation.
This also fixes a leak of the cipher algorithm that was previously
looked up but never freed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the authenc algorithm over to crypto_grab_skcipher
which is a prerequisite for IV generation.
This patch also changes authenc to set its ASYNC status depending on
the ASYNC status of the underlying skcipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes crypto_alloc_ablkcipher/crypto_grab_skcipher always
return algorithms that are capable of generating their own IVs through
givencrypt and givdecrypt. Each algorithm may specify its default IV
generator through the geniv field.
For algorithms that do not set the geniv field, the blkcipher layer will
pick a default. Currently it's chainiv for synchronous algorithms and
eseqiv for asynchronous algorithms. Note that if these wrappers do not
work on an algorithm then that algorithm must specify its own geniv or
it can't be used at all.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This generator generates an IV based on a sequence number by xoring it
with a salt and then encrypting it with the same key as used to encrypt
the plain text. This algorithm requires that the block size be equal
to the IV size. It is mainly useful for CBC.
It has one noteworthy property that for IPsec the IV happens to lie
just before the plain text so the IV generation simply increases the
number of encrypted blocks by one. Therefore the cost of this generator
is entirely dependent on the speed of the underlying cipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The chain IV generator is the one we've been using in the IPsec stack.
It simply starts out with a random IV, then uses the last block of each
encrypted packet's cipher text as the IV for the next packet.
It can only be used by synchronous ciphers since we have to make sure
that we don't start the encryption of the next packet until the last
one has completed.
It does have the advantage of using very little CPU time since it doesn't
have to generate anything at all.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates the infrastructure to help the construction of givcipher
templates that wrap around existing blkcipher/ablkcipher algorithms by adding
an IV generator to them.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the underlying algorithm specifies a specific geniv algorithm then
we should use it for the cryptd version as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the geniv field which indicates the default IV
generator for each algorithm. It should point to a string that is not
freed as long as the algorithm is registered.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Different block cipher modes have different requirements for intialisation
vectors. For example, CBC can use a simple randomly generated IV while
modes such as CTR must use an IV generation mechanisms that give a stronger
guarantee on the lack of collisions. Furthermore, disk encryption modes
have their own IV generation algorithms.
Up until now IV generation has been left to the users of the symmetric
key cipher API. This is inconvenient as the number of block cipher modes
increase because the user needs to be aware of which mode is supposed to
be paired with which IV generation algorithm.
Therefore it makes sense to integrate the IV generation into the crypto
API. This patch takes the first step in that direction by creating two
new ablkcipher operations, givencrypt and givdecrypt that generates an
IV before performing the actual encryption or decryption.
The operations are currently not exposed to the user. That will be done
once the underlying functionality has actually been implemented.
It also creates the underlying givcipher type. Algorithms that directly
generate IVs would use it instead of ablkcipher. All other algorithms
(including all existing ones) would generate a givcipher algorithm upon
registration. This givcipher algorithm will be constructed from the geniv
string that's stored in every algorithm. That string will locate a template
which is instantiated by the blkcipher/ablkcipher algorithm in question to
give a givcipher algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Note: From now on the collective of ablkcipher/blkcipher/givcipher will
be known as skcipher, i.e., symmetric key cipher. The name blkcipher has
always been much of a misnomer since it supports stream ciphers too.
This patch adds the function crypto_grab_skcipher as a new way of getting
an ablkcipher spawn. The problem is that previously we did this in two
steps, first getting the algorithm and then calling crypto_init_spawn.
This meant that each spawn user had to be aware of what type and mask to
use for these two steps. This is difficult and also presents a problem
when the type/mask changes as they're about to be for IV generators.
The new interface does both steps together just like crypto_alloc_ablkcipher.
As a side-effect this also allows us to be stronger on type enforcement
for spawns. For now this is only done for ablkcipher but it's trivial
to extend for other types.
This patch also moves the type/mask logic for skcipher into the helpers
crypto_skcipher_type and crypto_skcipher_mask.
Finally this patch introduces the function crypto_require_sync to determine
whether the user is specifically requesting a sync algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the necessary changes for GCM to be used with async
ciphers. This would allow it to be used with hardware devices that
support CTR.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As discussed previously, this patch moves the basic CTR functionality
into a chainable algorithm called ctr. The IPsec-specific variant of
it is now placed on top with the name rfc3686.
So ctr(aes) gives a chainable cipher with IV size 16 while the IPsec
variant will be called rfc3686(ctr(aes)). This patch also adjusts
gcm accordingly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the impending addition of the givcipher type, both blkcipher and
ablkcipher algorithms will use it to create givcipher objects. As such
it no longer makes sense to split the system between ablkcipher and
blkcipher. In particular, both ablkcipher.c and blkcipher.c would need
to use the givcipher type which has to reside in ablkcipher.c since it
shares much code with it.
This patch merges the two Kconfig options as well as the modules into one.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes the request context alignment so that it is actually
aligned to the value required by the algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a new helper crypto_attr_alg_name which is basically the
first half of crypto_attr_alg. That is, it returns an algorithm name
parameter as a string without looking it up. The caller can then look it
up immediately or defer it until later.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
i get here:
----
LD vmlinux
SYSMAP System.map
SYSMAP .tmp_System.map
Building modules, stage 2.
MODPOST 226 modules
ERROR: "crypto_hash_type" [crypto/authenc.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
---
which fails because crypto_hash_type is declared in crypto/hash.c. You might wanna
fix it like so:
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a simple speed test for salsa20.
Usage: modprobe tcrypt mode=206
Signed-of-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add common compression tester function
Modify deflate test case to use the common compressor test function
Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is a large test vector for Salsa20 that crosses the 4096-bytes
page boundary.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes the multi-page processing bug that affects large test
vectors (the same bug that previously affected ctr.c).
There is an optimization for the case walk.nbytes == nbytes. Also we
now use crypto_xor() instead of adhoc XOR routines.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The abreq structure is currently allocated on the stack. This is broken
if the underlying algorithm is asynchronous. This patch changes it so
that it's taken from the private context instead which has been enlarged
accordingly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unfortunately the generic chaining hasn't been ported to all architectures
yet, and notably not s390. So this patch restores the chainging that we've
been using previously which does work everywhere.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The scatterwalk infrastructure is used by algorithms so it needs to
move out of crypto for future users that may live in drivers/crypto
or asm/*/crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch changes gcm/authenc to return EBADMSG instead of EINVAL for
ICV mismatches. This convention has already been adopted by IPsec.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto_aead convention for ICVs is to include it directly in the
output. If we decided to change this in future then we would make
the ICV (if the algorithm has an explicit one) available in the
request itself.
For now no algorithm needs this so this patch changes gcm to conform
to this convention. It also adjusts the tcrypt aead tests to take
this into account.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the gcm(aes) tests have to be taken together with all other
ciphers. This patch makes it available by itself at number 35.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The previous code incorrectly included the hash in the verification which
also meant that we'd crash and burn when it comes to actually verifying
the hash since we'd go past the end of the SG list.
This patch fixes that by subtracting authsize from cryptlen at the start.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Having enckeylen as a template parameter makes it a pain for hardware
devices that implement ciphers with many key sizes since each one would
have to be registered separately.
Since the authenc algorithm is mainly used for legacy purposes where its
key is going to be constructed out of two separate keys, we can in fact
embed this value into the key itself.
This patch does this by prepending an rtnetlink header to the key that
contains the encryption key length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is authsize is an algorithm paramter which cannot be changed at
run-time. This is inconvenient because hardware that implements such
algorithms would have to register each authsize that they support
separately.
Since authsize is a property common to all AEAD algorithms, we can add
a function setauthsize that sets it at run-time, just like setkey.
This patch does exactly that and also changes authenc so that authsize
is no longer a parameter of its template.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since alignment masks are always one less than a power of two, we can
use binary or to find their maximum.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
These utilities implemented in lib/hexdump.c are more handy, please use this.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add test vectors to tcrypt for AES in CBC mode for key sizes 192 and 256.
The test vectors are copied from NIST SP800-38A.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a large AES CTR mode test vector. The test vector is
4100 bytes in size. It was generated using a C++ program that called
Crypto++.
Note that this patch increases considerably the size of "struct
cipher_testvec" and hence the size of tcrypt.ko.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the number of entries in a cipher test vector template is
limited by TVMEMSIZE/sizeof(struct cipher_testvec). This patch
circumvents the problem by pointing cipher_tv to each entry in the
template, rather than the template itself.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the data spans across a page boundary, CTR may incorrectly process
a partial block in the middle because the blkcipher walking code may
supply partial blocks in the middle as long as the total length of the
supplied data is more than a block. CTR is supposed to return any unused
partial block in that case to the walker.
This patch fixes this by doing exactly that, returning partial blocks to
the walker unless we received less than a block-worth of data to start
with.
This also allows us to optimise the bulk of the processing since we no
longer have to worry about partial blocks until the very end.
Thanks to Tan Swee Heng for fixes and actually testing this :)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add GCM/GMAC support to cryptoapi.
GCM (Galois/Counter Mode) is an AEAD mode of operations for any block cipher
with a block size of 16. The typical example is AES-GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add AEAD support to tcrypt, needed by GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Analogously to camellia7 patch, move
"absorb kw2 to other subkeys" and "absorb kw4 to other subkeys"
code parts into camellia_setup_tail(). This further reduces
source and object code size at the cost of two brances
in key setup code.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move "key XOR is end of F-function" code part into
camellia_setup_tail(), it is sufficiently similar
between camellia_setup128 and camellia_setup256.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
unifies encrypt/decrypt routines for different key lengths.
This reduces module size by ~25%, with tiny (less than 1%)
speed impact.
Also collapses encrypt/decrypt into more readable
(visually shorter) form using macros.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unused macro params.
Use (u8)(expr) instead of (expr) & 0xff,
helps gcc to realize how to use simpler commands.
Move CAMELLIA_FLS macro closer to encrypt/decrypt routines.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces the custom xor in CBC with the generic crypto_xor.
It changes the operations for in-place encryption slightly to avoid
calling crypto_xor with tmpbuf since it is not necessarily aligned.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All common block ciphers have a block size that's a power of 2. In fact,
all of our block ciphers obey this rule.
If we require this then CBC can be optimised to avoid an expensive divide
on in-place decryption.
I've also changed the saving of the first IV in the in-place decryption
case to the last IV because that lets us use walk->iv (which is already
aligned) for the xor operation where alignment is required.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the addition of more stream ciphers we need to curb the proliferation
of ad-hoc xor functions. This patch creates a generic pair of functions,
crypto_inc and crypto_xor which does big-endian increment and exclusive or,
respectively.
For optimum performance, they both use u32 operations so alignment must be
as that of u32 even though the arguments are of type u8 *.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now we have ablkcipher algorithms have been identified as
type BLKCIPHER with the ASYNC bit set. This is suboptimal because
ablkcipher refers to two things. On the one hand it refers to the
top-level ablkcipher interface with requests. On the other hand it
refers to and algorithm type underneath.
As it is you cannot request a synchronous block cipher algorithm
with the ablkcipher interface on top. This is a problem because
we want to be able to eventually phase out the blkcipher top-level
interface.
This patch fixes this by making ABLKCIPHER its own type, just as
we have distinct types for HASH and DIGEST. The type it associated
with the algorithm implementation only.
Which top-level interface is used for synchronous block ciphers is
then determined by the mask that's used. If it's a specific mask
then the old blkcipher interface is given, otherwise we go with the
new ablkcipher interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the crypto scatterwalk code to use the generic
scatterlist chaining rather the version specific to crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Resubmitting this patch which extends sha256_generic.c to support SHA-224 as
described in FIPS 180-2 and RFC 3874. HMAC-SHA-224 as described in RFC4231
is then supported through the hmac interface.
Patch includes test vectors for SHA-224 and HMAC-SHA-224.
SHA-224 chould be chosen as a hash algorithm when 112 bits of security
strength is required.
Patch generated against the 2.6.24-rc1 kernel and tested against
2.6.24-rc1-git14 which includes fix for scatter gather implementation for HMAC.
Signed-off-by: Jonathan Lynch <jonathan.lynch@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
NO other block mode is M by default.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch exports four tables and the set_key() routine. This ressources
can be shared by other AES implementations (aes-x86_64 for instance).
The decryption key has been turned around (deckey[0] is the first piece
of the key instead of deckey[keylen+20]). The encrypt/decrypt functions
are looking now identical (except they are using different tables and
key).
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds countersize to CTR mode.
The template is now ctr(algo,noncesize,ivsize,countersize).
For example, ctr(aes,4,8,4) indicates the counterblock
will be composed of a salt/nonce that is 4 bytes, an iv
that is 8 bytes and the counter is 4 bytes.
When noncesize + ivsize < blocksize, CTR initializes the
last block - ivsize - noncesize portion of the block to
zero. Otherwise the counter block is composed of the IV
(and nonce if necessary).
If noncesize + ivsize == blocksize, then this indicates that
user is passing in entire counterblock. Thus countersize
indicates the amount of bytes in counterblock to use as
the counter for incrementing. CTR will increment counter
portion by 1, and begin encryption with that value.
Note that CTR assumes the counter portion of the block that
will be incremented is stored in big endian.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move huge unrolled pieces of code (3 screenfuls) at the end of
128/256 key setup routines into common camellia_setup_tail(),
convert it to loop there.
Loop is still unrolled six times, so performance hit is very small,
code size win is big.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Optimize GETU32 to use 4-byte memcpy (modern gcc will convert
such memcpy to single move instruction on i386).
Original GETU32 did four byte fetches, and shifted/XORed those.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rename some macros to shorter names: CAMELLIA_RR8 -> ROR8,
making it easier to understand that it is just a right rotation,
nothing camellia-specific in it.
CAMELLIA_SUBKEY_L() -> SUBKEY_L() - just shorter.
Move be32 <-> cpu conversions out of en/decrypt128/256 and into
camellia_en/decrypt - no reason to have that code duplicated twice.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move code blocks around so that related pieces are closer together:
e.g. CAMELLIA_ROUNDSM macro does not need to be separated
from the rest of the code by huge array of constants.
Remove unused macros (COPY4WORD, SWAP4WORD, XOR4WORD[2])
Drop SUBL(), SUBR() macros which only obscure things.
Same for CAMELLIA_SP1110() macro and KEY_TABLE_TYPE typedef.
Remove useless comments:
/* encryption */ -- well it's obvious enough already!
void camellia_encrypt128(...)
Combine swap with copying at the beginning/end of encrypt/decrypt.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently twofish cipher key setup code
has unrolled loops - approximately 70-100
instructions are repeated 40 times.
As a result, twofish module is the biggest module
in crypto/*.
Unrolling produces x2.5 more code (+18k on i386), and speeds up key
setup by 7%:
unrolled: twofish_setkey/sec: 41128
loop: twofish_setkey/sec: 38148
CALC_K256: ~100 insns each
CALC_K192: ~90 insns
CALC_K: ~70 insns
Attached patch removes this unrolling.
$ size */twofish_common.o
text data bss dec hex filename
37920 0 0 37920 9420 crypto.org/twofish_common.o
13209 0 0 13209 3399 crypto/twofish_common.o
Run tested (modprobe tcrypt reports ok). Please apply.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This three defines are used in all AES related hardware.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HIFN driver update to use DES weak key checks (exported in this patch).
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates include/crypto/des.h for common macros shared between
DES implementations.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements CTR mode for IPsec.
It is based off of RFC 3686.
Please note:
1. CTR turns a block cipher into a stream cipher.
Encryption is done in blocks, however the last block
may be a partial block.
A "counter block" is encrypted, creating a keystream
that is xor'ed with the plaintext. The counter portion
of the counter block is incremented after each block
of plaintext is encrypted.
Decryption is performed in same manner.
2. The CTR counterblock is composed of,
nonce + IV + counter
The size of the counterblock is equivalent to the
blocksize of the cipher.
sizeof(nonce) + sizeof(IV) + sizeof(counter) = blocksize
The CTR template requires the name of the cipher
algorithm, the sizeof the nonce, and the sizeof the iv.
ctr(cipher,sizeof_nonce,sizeof_iv)
So for example,
ctr(aes,4,8)
specifies the counterblock will be composed of 4 bytes
from a nonce, 8 bytes from the iv, and 4 bytes for counter
since aes has a blocksize of 16 bytes.
3. The counter portion of the counter block is stored
in big endian for conformance to rfc 3686.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is crypto_remove_spawn may try to unregister an instance which is
yet to be registered. This patch fixes this by checking whether the
instance has been registered before attempting to remove it.
It also removes a bogus cra_destroy check in crypto_register_instance as
1) it's outside the mutex;
2) we have a check in __crypto_register_alg already.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that newer versions of gcc have regressed in their abilities to
analyse initialisations. This patch moves the initialisations up to avoid
the warnings.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Not architecture specific code should not #include <asm/scatterlist.h>.
This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch moves the sg_init_table out of the timing loops for hash
algorithms so that it doesn't impact on the speed test results.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
hmac_setkey(), hmac_init(), and hmac_final() have
a singular on-stack scatterlist. Initialit is
using sg_init_one() instead of using sg_set_buf().
Signed-off-by: David S. Miller <davem@davemloft.net>
Crypto now uses SG helper functions. Fix hmac_digest to use those
functions correctly and fix the oops associated with it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.
Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Convert the subdirectory "crypto" to UTF-8. The files changed are
<crypto/fcrypt.c> and <crypto/api.c>.
Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
There are currently several SHA implementations that all define their own
initialization vectors and size values. Since this values are idential
move them to a header file under include/crypto.
Signed-off-by: Jan Glauber <jang@de.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Also remove the probe for sha1 in padlock's init code.
Quote from Herbert:
The probe is actually pointless since we can always probe when
the algorithm is actually used which does not lead to dead-locks
like this.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the helper blkcipher_walk_virt_block which is similar to
blkcipher_walk_virt but uses a supplied block size instead of the block
size of the block cipher. This is useful for CTR where the block size is
1 but we still want to walk by the block size of the underlying cipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the block size is no longer a multiple of the alignment, we need to
increase the kmalloc amount in blkcipher_next_slow to use the aligned block
size.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a comment to explain why we compare the cra_driver_name of
the algorithm being registered against the cra_name of a larval as opposed
to the cra_driver_name of the larval.
In fact larvals have only one name, cra_name which is the name that was
requested by the user. The test here is simply trying to find out whether
the algorithm being registered can or can not satisfy the larval.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Previously we assumed for convenience that the block size is a multiple of
the algorithm's required alignment. With the pending addition of CTR this
will no longer be the case as the block size will be 1 due to it being a
stream cipher. However, the alignment requirement will be that of the
underlying implementation which will most likely be greater than 1.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We do not allow spaces in algorithm names or parameters. Thanks to Joy Latten
for pointing this out.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As Joy Latten points out, inner algorithm parameters will miss the closing
bracket which will also cause the outer algorithm to terminate prematurely.
This patch fixes that also kills the WARN_ON if the number of parameters
exceed the maximum as that is a user error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
XTS currently considered to be the successor of the LRW mode by the IEEE1619
workgroup. LRW was discarded, because it was not secure if the encyption key
itself is encrypted with LRW.
XTS does not have this problem. The implementation is pretty straightforward,
a new function was added to gf128mul to handle GF(128) elements in ble format.
Four testvectors from the specification
http://grouper.ieee.org/groups/1619/email/pdf00086.pdf
were added, and they verify on my system.
Signed-off-by: Rik Snel <rsnel@cube.dyndns.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use max in blkcipher_get_spot() instead of open coding it.
Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When scatterwalk is built as a module digest.c was broken because it
requires the crypto_km_types structure which is in scatterwalk. This
patch removes the crypto_km_types structure by encoding the logic into
crypto_kmap_type directly.
In fact, this even saves a few bytes of code (not to mention the data
structure itself) on i386 which is about the only place where it's
needed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the authenc algorithm which constructs an AEAD algorithm
from an asynchronous block cipher and a hash. The construction is done
by concatenating the encrypted result from the cipher with the output
from the hash, as is used by the IPsec ESP protocol.
The authenc algorithm exists as a template with four parameters:
authenc(auth, authsize, enc, enckeylen).
The authentication algorithm, the authentication size (i.e., truncating
the output of the authentication algorithm), the encryption algorithm,
and the encryption key length. Both the size field and the key length
field are in bytes. For example, AES-128 with SHA1-HMAC would be
represented by
authenc(hmac(sha1), 12, cbc(aes), 16)
The key for the authenc algorithm is the concatenation of the keys for
the authentication algorithm with the encryption algorithm. For the
above example, if a key of length 36 bytes is given, then hmac(sha1)
would receive the first 20 bytes while the last 16 would be given to
cbc(aes).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the function scatterwalk_map_and_copy which reads or
writes a chunk of data from a scatterlist at a given offset. It will
be used by authenc which would read/write the authentication data at
the end of the cipher/plain text.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The scatterwalk code is only used by algorithms that can be built as
a module. Therefore we can move it into algapi.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since not everyone needs a queue pointer and those who need it can
always get it from the context anyway the queue pointer in the
common alg object is redundant.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch ensures that kernel.h and slab.h are included for
the setkey_unaligned function. It also breaks a couple of
long lines.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for having multiple parameters to
a template, separated by a comma. It also adds support
for integer parameters in addition to the current algorithm
parameter type.
This will be used by the authenc template which will have
four parameters: the authentication algorithm, the encryption
algorithm, the authentication size and the encryption key
length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds crypto_aead which is the interface for AEAD
(Authenticated Encryption with Associated Data) algorithms.
AEAD algorithms perform authentication and encryption in one
step. Traditionally users (such as IPsec) would use two
different crypto algorithms to perform these. With AEAD
this comes down to one algorithm and one operation.
Of course if traditional algorithms were used we'd still
be doing two operations underneath. However, real AEAD
algorithms may allow the underlying operations to be
optimised as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the SEED cipher (RFC4269).
This patch have been used in few VPN appliance vendors in Korea for
several years. And it was verified by KISA, who developed the
algorithm itself.
As its importance in Korean banking industry, it would be great
if linux incorporates the support.
Signed-off-by: Hye-Shik Chang <perky@FreeBSD.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Other options requiring specific block cipher algorithms already have
the appropriate select's.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix dma_wait_for_async_tx to not loop forever in the case where a
dependency chain is longer than two entries. This condition will not
happen with current in-kernel drivers, but fix it for future drivers.
Found-by: Saeed Bishara <saeed.bishara@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The previous patch had the conditional inverted. This patch fixes it
so that we return the original position if it does not straddle a page.
Thanks to Bob Gilligan for spotting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function blkcipher_get_spot tries to return a buffer of
the specified length that does not straddle a page. It has
an off-by-one bug so it may advance a page unnecessarily.
What's worse, one of its callers doesn't provide a buffer
that's sufficiently long for this operation.
This patch fixes both problems. Thanks to Bob Gilligan for
diagnosing this problem and providing a fix.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
setkey_unaligned() commited in ca7c39385c
overwrites unallocated memory in the following memset() because
I used the wrong buffer length.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andrew Morton:
[async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
slot for both src add dest and we get either corrupted data or a BUG.
Evgeniy Polyakov:
Btw, shouldn't it always be kmap_atomic() even if flag is not set.
That pages are usual one returned by alloc_page().
So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC flags.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simple and stupid - just use the same code from another place in the kernel.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies. It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations. Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources.
I imagine that any piece of ADMA hardware would register with the
'async_*' subsystem, and a call to async_X would be routed as
appropriate, or be run in-line. - Neil Brown
async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:
struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
dma_async_tx_callback cb_fn, void *cb_param)
{
struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
struct dma_device *device = chan ? chan->device : NULL;
int int_en = cb_fn ? 1 : 0;
struct dma_async_tx_descriptor *tx = device ?
device->device_prep_dma_<operation>(chan, len, int_en) : NULL;
if (tx) { /* run <operation> asynchronously */
...
tx->tx_set_dest(addr, tx, index);
...
tx->tx_set_src(addr, tx, index);
...
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else { /* run <operation> synchronously */
...
<operation>
...
async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
}
return tx;
}
async_tx_find_channel() returns a capable channel from its pool. The
channel pool is organized as a per-cpu array of channel pointers. The
async_tx_rebalance() routine is tasked with managing these arrays. In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities. For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor. In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel. A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.
Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api. A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit. With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.
On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement). For the other cases performance was roughly equal, +/- a few
percentage points. On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation. According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.
The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available
Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
async_tx_find_channel a static inline routine that always returns NULL
* when a callback is specified for a given transaction an interrupt will
fire at operation completion time and the callback will occur in a
tasklet. if the the channel does not support interrupts then a live
polling wait will be performed
* the api is written as a dmaengine client that requests all available
channels
* In support of dependencies the api implicitly schedules channel-switch
interrupts. The interrupt triggers the cleanup tasklet which causes
pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
xor routine. To the software routine the destination address is an implied
source, whereas engines treat it as a write-only destination. This patch
modifies the xor_blocks routine to take a an explicit destination address
to mirror the hardware.
Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts <= 1
* implicitly handle hardware concerns like channel switching and
interrupts, Neil Brown
* remove the per operation type list, and distribute operation capabilities
evenly amongst the available channels
* simplify async_tx_find_channel to optimize the fast path
* introduce the channel_table_initialized flag to prevent early calls to
the api
* reorganize the code to mimic crypto
* include mm.h as not all archs include it in dma-mapping.h
* make the Kconfig options non-user visible, Adrian Bunk
* move async_tx under crypto since it is meant as 'core' functionality, and
the two may share algorithms in the future
* move large inline functions into c files
* checkpatch.pl fixes
* gpl v2 only correction
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
The async_tx api tries to use a dma engine for an operation, but will fall
back to an optimized software routine otherwise. Xor support is
implemented using the raid5 xor routines. For organizational purposes this
routine is moved to a common area.
The following fixes are also made:
* rename xor_block => xor_blocks, suggested by Adrian Bunk
* ensure that xor.o initializes before md.o in the built-in case
* checkpatch.pl fixes
* mark calibrate_xor_blocks __init, Adrian Bunk
Cc: Adrian Bunk <bunk@stusta.de>
Cc: NeilBrown <neilb@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Evgeniy's hifn driver and probably mine don't use ablkcipher->queue at all.
The show method of ablkcipher will access this field without checking if it
is valid.
Signed-off-by: Sebastian Siewior <linux-crypto@ml.breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
setkey() in {cipher,blkcipher,ablkcipher,hash}.c does not respect the
requested alignment by the algorithm. This patch fixes it. The extra
memory is allocated by kmalloc() with GFP_ATOMIC flag.
Signed-off-by: Sebastian Siewior <linux-crypto@ml.breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Right now when a larval matures or when it dies of an error we
only wake up one waiter. This would cause other waiters to timeout
unnecessarily. This patch changes it to use complete_all to wake
up all waiters.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use menuconfigs instead of menus, so the whole menu can be disabled at once
instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>