BY 4.0 license Open Access Published by De Gruyter February 14, 2024

Differential experiments using parallel alternative operations

  • Marco Calderini, Roberto Civino and Riccardo Invernizzi

Abstract

The use of alternative operations in differential cryptanalysis, or alternative notions of differentials, is lately receiving increasing attention. Recently, Civino et al. managed to design a block cipher that is secure with respect to the classical differential cryptanalysis performed using XOR-differentials, but weaker with respect to the attack based on an alternative difference operation acting on the first s-box of the block. We extend this result to parallel alternative operations, i.e. acting on each s-box of the block. First, we recall the mathematical framework needed to define and use such operations. After that, we perform some differential experiments against a toy cipher and compare the effectiveness of the attack with respect to the one that uses XOR-differentials.

MSC 2010: 20B35; 94A60; 68P25

1 Introduction

Differential cryptanalysis is a powerful tool introduced in the early 1990s to attack some symmetric cryptographic primitives, namely block ciphers [1]. The attack, which has later been generalised [2–4], is typically a chosen-plaintext attack that takes advantage of nonuniform relations between input differences and corresponding output differences.

To mitigate vulnerability to these attack methods, the cryptographic transformations employed within the substitution boxes (s-boxes) of the cipher should aim for the lowest possible level of differential uniformity [5] (for a comprehensive exploration of the differential uniformity of vectorial Boolean functions, readers can refer to Mesnager et al.’s survey [6]). It is essential to emphasize that the calculation of differential uniformity is based on the XOR operation. Indeed, in a traditional scenario of cryptanalysis of block ciphers, the difference operation classically taken into consideration by both designers and cryptanalysts is the one used to mix the key during the encryption process. In many cases, this operation is the bit-wise addition modulo two, i.e. the XOR. Nevertheless, it is worth noting that alternative types of operations may also be contemplated. For example, Berson introduced the modular difference to study the MD/SHA family of hash functions [7], and a similar method has been used [8] to cryptanalyze the block cipher PRESENT [9]. Borisov et al. [10] proposed a new type of differential known as multiplicative differential to attack IDEA [11]. This inspired the definition of c-differential uniformity [12], which has been extensively studied in the last couple of years, even if the cryptographic implication of c-differential uniformity for attacks on block ciphers remains a subject of ongoing debate [13]. In 2019, Civino et al. showed that a differential attack making use of alternative differences may be effective against XOR-based ciphers that are resistant to the classical differential attack [14]. More precisely, they designed a small-scale substitution-permutation network (SPN) inspired by the block cipher PRESENT, with five s-boxes of three bits each. They introduced a new sum on the whole message space that acts as the XOR on the last four s-boxes, while on the first one it coincides with one of the alternative sums defined by Calderini et al. [15], coming from another elementary abelian regular group of translations (translation groups in short). Using such an operation, they were able to mount a distinguishing attack on five rounds of the cipher. Moreover, they showed that this result cannot be obtained with the traditional differential approach, i.e. by looking at the distribution of classical differentials.

In this work, we show that this idea can be extended to the whole block, attacking all the s-boxes at the same time. The difference operator that we consider comes again from the family of alternative operations introduced by Calderini et al. [15]. Although it could be made more general without effort, for the ease of description, we focus here on the case of translation-based ciphers [16], which are the most common form of SPNs nowadays, where the encryption is realised by subsequent iterations of a non-linear s-box layer, a (usually) linear permutation, and an XOR-based key addition.

In order to guarantee deterministic propagation of differences through the diffusion layers of the ciphers, as happens in the classical scenario, we need to characterise the XOR-linear bijective maps that are also linear with respect to another (parallel) sum. As we will discuss later, the mentioned problem is of general interest in cryptography, and results in this direction will produce examples of XOR-based trapdoor ciphers for which a non-XOR distinguisher may exist. Unfortunately, only partial solutions to this problem are known [14,17,18]. Keeping our focus in this direction, after performing some computational experiments, we are able to design a toy cipher similar to the one suggested in the study by Civino et al. [14], with four parallel s-boxes of four bits each, together with an alternative parallel operation that can be used to attack it. A computer-aided direct check shows that the diffusion layer of the proposed cipher is a linear permutation with respect to both the operations $+$ and $\circ$, which is the pivotal condition for the success of the attack.

We show the results of a distinguishing attack for a number of rounds up to 17, concluding that differentials based on our alternative operation have much higher probabilities. Moreover, the difference in probability between alternative differentials and classical differentials that we obtain is higher than the one obtained in the study by Civino et al. [14], showing the effectiveness of our approach.

The article is organised as follows: in Section 2, we introduce the notation and describe the general setting for our attack; in Section 2.1, we give a brief summary of the construction of alternative operations coming from elementary abelian regular groups; in Section 3, we design a 16-bit cipher and a suitable parallel alternative operation, and perform experiments to study its resistance to differential cryptanalysis, showing the consistent improvement of our approach with respect to the classical one; and in Section 4, we conclude the article with the discussion of some open problems.

2 Preliminaries

Let $V = (\mathbb{F}_2)^n$ be a binary vector space, with canonical basis $e_1, \ldots, e_n$, which will represent the plaintext-ciphertext space of an $n$-bit block cipher. We denote by

$$T_+ := \{\sigma_a \mid a \in V,\ \sigma_a : x \mapsto x + a\} < \mathrm{Sym}(V)$$

the group of translations on $V$. We stress that the action of this translation group on the message space $V$ represents the XOR-based key-addition layer in a block cipher. Let us also note that $T_+$, as a subgroup of the symmetric group $\mathrm{Sym}(V)$, is elementary, abelian, and regular. We recall here the definition of regularity.

Definition 1

A permutation group $G$ acting transitively on a set $V$ is said to be regular if, for all $v \in V$, the stabilizer of $G$ at $v$, $G_v := \{g \in G \mid vg = v\}$, is trivial.

It is well known that any other elementary abelian regular subgroup of $\mathrm{Sym}(V)$ is conjugate to $T_+$.

Theorem 1

[19] Let $T < \mathrm{Sym}(V)$ be an elementary abelian regular subgroup. Then, there exists $g \in \mathrm{Sym}(V)$ such that $T = T_+^{g} = g^{-1} T_+ g$.

Let us now show how to define another operation on V starting from another elementary abelian regular group of translations.

Definition 2

Let $T < \mathrm{Sym}(V)$ be an elementary abelian regular group. Let us define an additive group operation $\circ$ on $V$ by letting, for each $a$ and $b$ in $V$,

$$a \circ b := a\tau_b,$$

where $\tau_b$ is the unique element of $T$ sending $0$ to $b$.

Proposition 1

If $\circ$ is defined as above, then $(V, \circ)$ is a vector space over $\mathbb{F}_2$, with associated translation group $T_\circ = T$. Moreover, $(V, \circ) \cong (V, +)$.

The subspaces introduced below are essential to understand the structure of alternative operations coming from translation groups. We refer to Civino et al. for a more detailed discussion [14].

Definition 3

Given an operation $\circ$ as above, a vector $k \in V$ is called a weak key if, for each $x \in V$, it holds $x + k = x \circ k$. The set

$$W_\circ := \{k \mid k \in V,\ k \text{ is a weak key}\}$$

is called the weak-key space, and is a subspace of both $(V, +)$ and $(V, \circ)$. We denote by $d \geq 1$ its dimension. Moreover, let us define a dot product $\cdot$ on $V$ such that for each $a, b \in V$,

$$a \cdot b := a + b + a \circ b.$$

The set of elements that can be expressed as dot products is denoted by

$$U_\circ := \{x \cdot y \mid x, y \in V\}$$

and is called the set of errors.

Finally, denoting by $\mathrm{GL}(V, +)$ and $\mathrm{GL}(V, \circ)$ the groups of linear permutations with respect to $+$ and $\circ$, respectively, we define

$$H_\circ := \mathrm{GL}(V, +) \cap \mathrm{GL}(V, \circ).$$

We now briefly present the impact of an alternative sum on the differential cryptanalysis of SPNs. The classical differential attack relies on the property that each XOR-difference stays the same after the key is XOR-ed to the state. This is not the case when considering $\circ$-differences. Indeed, let us consider two inputs with $\circ$-difference $\Delta$, denoted by $x$ and $x \circ \Delta$. After the key addition, the difference becomes

$$\Delta' := (x + k) \circ ((x \circ \Delta) + k),$$

which in general differs from $\Delta$. However, it can be shown [14] that if $T_+ < \mathrm{AGL}(V, \circ)$ and $T_\circ < \mathrm{AGL}(V, +)$, then

$$\Delta' = \Delta + k \cdot \Delta, \qquad (1)$$

i.e. in a particular setting, the output difference after the key-addition layer can be expressed in terms of the dot product introduced in Definition 3. By definition, $k \cdot \Delta$ belongs to $U_\circ$, and therefore, the number of possible output differences is bounded by $|U_\circ|$. The presence of the error in equation (1), of course, forces us to take into account the $\circ$-differential probabilities introduced by the key-addition layer, unlike in the classical case, yielding a disadvantage in terms of the final probability of the differential propagation.
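Equation (1) can be checked exhaustively on a small instance. The sketch below assumes the 4-bit operation obtained from the construction of Section 2.1 with $b = (0, 1)$, written in the closed form $a \circ b = a + b + (a_1 b_2 + a_2 b_1)(0, 0, b)$. This closed form, the helper names `circ4`, `dot` and `output_difference`, and the encoding of vectors as integers with the most significant bit first are illustrative assumptions rather than part of the original text.

```python
def circ4(a, b):
    """Assumed 4-bit alternative sum for b = (0, 1); ints encode vectors, MSB = first coordinate."""
    a1, a2 = (a >> 3) & 1, (a >> 2) & 1
    b1, b2 = (b >> 3) & 1, (b >> 2) & 1
    # The correction term (a1*b2 + a2*b1) * (0, 0, 0, 1) lands on the last coordinate.
    return a ^ b ^ ((a1 & b2) ^ (a2 & b1))


def dot(a, b):
    """Dot product of Definition 3: a . b = a + b + (a o b)."""
    return a ^ b ^ circ4(a, b)


def output_difference(x, delta, k):
    """o-difference of the pair (x, x o delta) after the key k has been XOR-ed in."""
    return circ4(x ^ k, circ4(x, delta) ^ k)


# Exhaustive check of equation (1) over all states, differences and keys.
equation_1_holds = all(
    output_difference(x, delta, k) == delta ^ dot(k, delta)
    for x in range(16) for delta in range(16) for k in range(16)
)
# The set of errors U for this operation.
errors = sorted({dot(x, y) for x in range(16) for y in range(16)})
print(equation_1_holds, errors)
```

Under these assumptions the output difference does not depend on $x$, and the key can only move $\Delta$ to one of the $|U_\circ|$ values $\Delta + u$ with $u \in U_\circ$.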

On the other hand, the s-box is usually designed to have the lowest possible differential uniformity with respect to the XOR. This may no longer be true with respect to the operation $\circ$. A higher differential uniformity creates trails with higher probability for $\circ$ and counterbalances the effect of the differential probability introduced by the key addition.

Finally, and more importantly, the diffusion layer $\lambda$ of an SPN is usually an XOR-linear map. In order to mount a successful $\circ$-differential attack, we need $\lambda$ to be $\circ$-linear as well, i.e. $\lambda \in H_\circ$. Otherwise, $\lambda$ would be a $\circ$-non-linear map and the effect of block-sized differential probabilities introduced by the diffusion layer would make the approach completely ineffective. This represents a strong motivation for the study of $H_\circ$.

We are now ready to show explicitly how operations coming from new translation groups are constructed.

2.1 Construction of alternative operations

For the reason explained above (equation (1)), it is convenient to consider operations $\circ$ on $V$ coming from a translation group $T_\circ < \mathrm{Sym}(V)$ such that $T_+ < \mathrm{AGL}(V, \circ)$ and $T_\circ < \mathrm{AGL}(V, +)$, which is the setting we will assume from now on. We will make use of the construction of such operations as presented in the study by Calderini et al. [15], but we will omit here many of the details, which the interested reader can find in the cited article.

Recall that we denote $n = \dim(V)$ and that we have $1 \leq d = \dim(W_\circ) \leq n - 2$ [15]. We will focus on the particular case $d = n - 2$. The reason for this is that the case when the dimension of the weak-key space reaches its upper bound is one of those in which the structure of $H_\circ$ is known (Theorem 2). Thanks to Calderini et al. [15, Theorem 3.9], we may assume, up to conjugation, that $W_\circ$ is spanned by $\{e_3, \ldots, e_n\}$. In this setting, from Calderini et al. [15, Theorem 3.11] (but see also Civino et al. [14, Theorem 3.3]), we have

$$a \circ e_i = a\tau_{e_i} = a M_{e_i} + e_i,$$

where

$$M_{e_1} = \begin{pmatrix} 1_2 & \begin{matrix} 0 \\ b \end{matrix} \\ 0_{n-2,2} & 1_{n-2} \end{pmatrix}, \qquad M_{e_2} = \begin{pmatrix} 1_2 & \begin{matrix} b \\ 0 \end{matrix} \\ 0_{n-2,2} & 1_{n-2} \end{pmatrix},$$

and $M_{e_j} = 1_n$ for $j \geq 3$, where $1_k$ denotes the identity matrix of size $k \times k$ and $0_{k,\ell}$ is the zero matrix of size $k \times \ell$. The element $b$ is a non-zero vector in $(\mathbb{F}_2)^{n-2}$, which completely determines $\circ$. Once the operation is defined on the basis, it is easy to compute $a \circ b$, for $a, b \in V$.
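As a concrete illustration of the construction above, the following sketch builds the matrices $M_{e_i}$ for $n = 4$ and $b = (0, 1)$ and compares them with a closed-form version of $\circ$. The closed form $a \circ b = a + b + (a_1 b_2 + a_2 b_1)(0, 0, b)$ is our own derivation from the matrices, and all names (`matrix_M`, `circ`, `ERR`, and so on) are hypothetical. Vectors are encoded as integers with the most significant bit as the first coordinate, and row vectors act on matrices from the left.

```python
N = 4        # block size n (assumed; this is the 4-bit case used in the experiments)
B = (0, 1)   # the non-zero vector b in (F_2)^(n-2)


def to_vec(a):
    """Integer -> row vector over F_2, most significant bit first."""
    return [(a >> (N - 1 - i)) & 1 for i in range(N)]


def to_int(v):
    out = 0
    for bit in v:
        out = (out << 1) | bit
    return out


def matrix_M(i):
    """The matrix M_{e_i} (1-indexed) as a list of rows."""
    m = [[int(r == c) for c in range(N)] for r in range(N)]
    if i == 1:
        m[1][2:] = list(B)   # top-right block has rows (0, b)
    elif i == 2:
        m[0][2:] = list(B)   # top-right block has rows (b, 0)
    return m


def vec_times_matrix(v, m):
    return [sum(v[r] & m[r][c] for r in range(N)) % 2 for c in range(N)]


ERR = to_int([0, 0] + list(B))   # the vector (0, 0, b)


def circ(a, b):
    """Assumed closed form of the alternative sum determined by b."""
    va, vb = to_vec(a), to_vec(b)
    coeff = (va[0] & vb[1]) ^ (va[1] & vb[0])
    return a ^ b ^ (coeff * ERR)


basis = [1 << (N - 1 - i) for i in range(N)]   # e_1, ..., e_n as integers

# The closed form agrees with a o e_i = a M_{e_i} + e_i on every basis vector.
matches_matrices = all(
    circ(a, basis[i]) == to_int(vec_times_matrix(to_vec(a), matrix_M(i + 1))) ^ basis[i]
    for a in range(2 ** N) for i in range(N)
)
# Weak keys: vectors k on which o and + coincide for every x.
weak_keys = [k for k in range(2 ** N) if all(circ(x, k) == x ^ k for x in range(2 ** N))]
print(matches_matrices, weak_keys)
```

In this encoding the weak keys are exactly the vectors supported on $e_3, e_4$, so the weak-key space has dimension $n - 2 = 2$, as assumed in the text.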

Let $r$ and $s$ be two positive integers; we will denote by $(\mathbb{F}_2)^{r \times s}$ the set of matrices of dimension $r \times s$. The following result is due to Civino et al. [14, Theorem 5.3] and characterises $H_\circ$ in the case $d = n - 2$.

Theorem 2

Let $b \in (\mathbb{F}_2)^{n-2}$ be as above and $\lambda \in (\mathbb{F}_2)^{n \times n}$. The following are equivalent:

  • $\lambda \in H_\circ$;

  • there exist $A \in \mathrm{GL}((\mathbb{F}_2)^2, +)$, $D \in \mathrm{GL}((\mathbb{F}_2)^d, +)$, and $B \in (\mathbb{F}_2)^{2 \times d}$ such that

    $$\lambda = \begin{pmatrix} A & B \\ 0_{d,2} & D \end{pmatrix}$$

    and $bD = b$.
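For small parameters, the equivalence in Theorem 2 can be compared against a brute-force search. The sketch below does this for $n = 4$, $d = 2$ and $b = (0, 1)$, enumerating all $4 \times 4$ binary matrices. The closed form used in `circ4`, the row-vector convention, and every function name are assumptions made for illustration.

```python
from itertools import product


def circ4(a, b):
    """Assumed 4-bit alternative sum for b = (0, 1); MSB = first coordinate."""
    a1, a2 = (a >> 3) & 1, (a >> 2) & 1
    b1, b2 = (b >> 3) & 1, (b >> 2) & 1
    return a ^ b ^ ((a1 & b2) ^ (a2 & b1))


def image(rows, x):
    """x * lam for a row vector x, where rows[i] is the (i+1)-th row of lam."""
    y = 0
    for i in range(4):
        if (x >> (3 - i)) & 1:
            y ^= rows[i]
    return y


def in_H(rows):
    """Direct check: lam is a bijection that is linear with respect to circ4."""
    img = [image(rows, x) for x in range(16)]
    if len(set(img)) < 16:
        return False
    return all(img[circ4(a, b)] == circ4(img[a], img[b])
               for a in range(16) for b in range(16))


def has_block_form(rows):
    """Condition of Theorem 2: lam = [[A, B], [0, D]] with A, D invertible and bD = b."""
    if rows[2] >> 2 or rows[3] >> 2:           # lower-left block must vanish
        return False
    A = (rows[0] >> 2, rows[1] >> 2)           # rows of A as 2-bit integers
    D = (rows[2] & 3, rows[3] & 3)             # rows of D as 2-bit integers
    # A 2x2 matrix over F_2 is invertible iff its rows are non-zero and distinct.
    invertible = lambda m: m[0] != 0 and m[1] != 0 and m[0] != m[1]
    # With b = (0, 1), the product bD is the second row of D.
    return invertible(A) and invertible(D) and D[1] == 0b01


checks = [(in_H(rows), has_block_form(rows)) for rows in product(range(16), repeat=4)]
size_of_H = sum(direct for direct, _ in checks)
theorem_agrees = all(direct == block for direct, block in checks)
print(size_of_H, theorem_agrees)
```

Counting the block form by hand gives 6 choices for $A$, $2^4$ for $B$, and 2 matrices $D$ fixing $b$, i.e. 192 elements, which is the value the search is expected to report for this instance.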

3 Experiments on a 16-bit block cipher with 4-bit s-boxes

As anticipated, the idea of this work is to design an SPN that is weak with respect to a differential attack based on an alternative parallel operation for which it is possible to show that the diffusion layer of the cipher belongs to $H_\circ$. We start by explaining explicitly what we mean by parallel: letting $V = V_1 \oplus \cdots \oplus V_m$, with $V_i \cong (\mathbb{F}_2)^n$ for $i = 1, \ldots, m$, and $x \in V$, we can split $x$ into $m$ vectors $x_1, \ldots, x_m$ of $n$ components each, and we can assume that the target SPN acting on a space of $m \times n$ bits contains $m$ s-boxes $S_1, \ldots, S_m$ such that $S_1$ acts on $x_1$, $S_2$ on $x_2$, and so on. For this reason, we aim to mount an alternative differential attack using a sum $\circ$ acting as

$$(x_1, \ldots, x_m) \circ (y_1, \ldots, y_m) = (x_1 \circ_1 y_1, \ldots, x_m \circ_m y_m),$$

where the sum $\circ_i$ acts on $V_i$. As explained previously, the feasibility of the attack relies on an extension of Theorem 2 to parallel sums.

In the absence of a general result in this sense, we have restricted our attention to the case where $W_{\circ_i} = \{k \mid k \in V_i,\ k \text{ is a weak key}\}$ has dimension $n - 2$, for $i = 1, \ldots, m$, and we have performed some computational experiments using Magma [20], which we describe below.

3.1 The target cipher and its trapdoor

Fixing $V = (\mathbb{F}_2)^{16}$, $n = 4$, and $m = 4$, and letting $\circ$ be the parallel sum defined by applying to each 4-bit block the alternative operation $\circ_4$ defined by the vector $b = (0, 1)$ (see Section 2.1), we could check using Magma that the diffusion layer $\lambda$ defined in Figure 1 belongs to $H_\circ$, i.e. it is a permutation that is linear with respect to both $+$ and $\circ$. Note that the mentioned matrix, which will be chosen as the diffusion layer of the target SPN, is obtained from the cyclic shift of two $4 \times 4$ binary sub-matrices. For the benefit of the reader, we display the Cayley table of the 4-bit operation $\circ_4$ induced by the vector $b = (0, 1)$ in Figure 2. The entries in which $a \circ_4 b$ differs from $a + b$ are highlighted.
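A minimal sketch of the parallel sum on 16-bit blocks follows, assuming the closed form of $\circ_4$ for $b = (0, 1)$ derived from the matrices of Section 2.1 and an integer encoding with the most significant bit first. The names `circ4`, `parallel_circ` and `differing_entries` are hypothetical. The last line counts the entries of the Cayley table in which $a \circ_4 b$ differs from $a + b$.

```python
def circ4(a, b):
    """Assumed 4-bit alternative sum for b = (0, 1); MSB = first coordinate."""
    a1, a2 = (a >> 3) & 1, (a >> 2) & 1
    b1, b2 = (b >> 3) & 1, (b >> 2) & 1
    return a ^ b ^ ((a1 & b2) ^ (a2 & b1))


def parallel_circ(x, y, m=4, n=4):
    """Apply circ4 independently to each of the m blocks of n = 4 bits."""
    mask = (1 << n) - 1
    out = 0
    for i in range(m):
        shift = n * i
        out |= circ4((x >> shift) & mask, (y >> shift) & mask) << shift
    return out


# Number of Cayley-table entries where the alternative sum differs from XOR.
differing_entries = sum(circ4(a, b) != a ^ b for a in range(16) for b in range(16))
print(hex(parallel_circ(0x8000, 0x4000)), differing_entries)
```

Under these assumptions, 96 of the 256 entries of the 4-bit Cayley table differ from the XOR table, namely those in which the first two coordinates of the operands are linearly independent.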

Figure 1. The chosen diffusion layer.

Figure 2. Cayley table of $\circ_4$.

The target cipher then features the 4-bit permutation $\gamma : (\mathbb{F}_2)^4 \to (\mathbb{F}_2)^4$ defined in Figure 3 as its s-box.

Figure 3. The chosen s-box $\gamma$.

Here, each vector is interpreted as a binary number, with most significant bit first. Precisely, four copies of $\gamma$ will act in parallel on the 16-bit block. Note that the s-box $\gamma$ is optimal according to Leander and Poschmann [21]. By computing the difference distribution table (DDT) of $\gamma$ with respect to XOR, we obtain the result displayed in Figure 4. As is known, $\gamma$ is differentially 4-uniform, which is the best possible result for a permutation over $(\mathbb{F}_2)^4$ (see, e.g. Leander and Poschmann [21]).

Figure 4. DDT of $\gamma$ with respect to $+$.

However, if we compute the DDT using our new operation $\circ_4$ as difference operator, we obtain the result displayed in Figure 5.

Figure 5. DDT of $\gamma$ with respect to $\circ$.

We can note that $\gamma$ turns out to be differentially 16-uniform with respect to $\circ_4$; in particular, when the input difference is $7_x$, the output difference is $6_x$ with probability 1. Besides this, it is clear from the table that the differential behaviour of the s-box is completely different when the alternative operation is considered, and that, with respect to $\circ_4$, the map is far from being as non-linear as a secure s-box needs to be.
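The two difference distribution tables can be computed with the same routine, parametrised by the difference operation. Since the s-box $\gamma$ of Figure 3 is not reproduced in this text, the sketch below uses the PRESENT s-box as a stand-in; this choice, the closed form of $\circ_4$ for $b = (0, 1)$, and the function names are all assumptions. The numbers it produces therefore describe the stand-in and not $\gamma$.

```python
# Stand-in s-box (PRESENT), NOT the s-box gamma of Figure 3.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def xor4(a, b):
    return a ^ b


def circ4(a, b):
    """Assumed 4-bit alternative sum for b = (0, 1); MSB = first coordinate."""
    a1, a2 = (a >> 3) & 1, (a >> 2) & 1
    b1, b2 = (b >> 3) & 1, (b >> 2) & 1
    return a ^ b ^ ((a1 & b2) ^ (a2 & b1))


def ddt(sbox, op):
    """table[d_in][d_out] = number of x with sbox(x) op sbox(x op d_in) == d_out."""
    table = [[0] * 16 for _ in range(16)]
    for d_in in range(16):
        for x in range(16):
            d_out = op(sbox[x], sbox[op(x, d_in)])
            table[d_in][d_out] += 1
    return table


def uniformity(table):
    """Largest entry over all non-zero input differences."""
    return max(max(row) for row in table[1:])


ddt_xor = ddt(SBOX, xor4)
ddt_circ = ddt(SBOX, circ4)
print(uniformity(ddt_xor), uniformity(ddt_circ))
```

Replacing `SBOX` with the table of Figure 3 would give the tables of Figures 4 and 5, provided the bit ordering matches the one assumed here.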

In our experiments described in the following section, we consider the SPN whose $i$-th round is obtained by the composition of the parallel application of the s-box $\gamma$ on every 4-bit block, of the diffusion layer $\lambda$ defined above, and of the XOR with the $i$-th round key.

3.2 Brute-forcing differentials

We study the difference propagation in the cipher in a long-key scenario, i.e. the key-schedule selects a random long key $k \in \mathbb{F}_2^{16r}$, where $r$ is the number of rounds. In order to mitigate the possible bias due to a particular key choice, we run our experiments by taking the average over $2^{15}$ random long-key generations. This approach will provide us with a good estimate of the expected differential probability of the best differentials on this cipher. The experimental computations, carried out by brute-forcing all the possible differentials, show that the best $i$-round differential for the classical XOR difference is always less likely than the best $i$-round differential computed using the mentioned parallel operation, for $i = 1, \ldots, 17$. The results, round per round, are displayed in Figure 6. In particular, when $i = 17$, the best $+$-differential is $0060_x \to 0700_x$ with probability $2^{-14.993}$, while using the difference associated to $b = (0, 1)$, the best 17-round $\circ$-differential is $0070_x \to 0600_x$ with probability $2^{-14.411}$.

Figure 6. Best $+$-differential probability vs best $\circ$-differential probability.
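The long-key experiment can be approximated by sampling. The sketch below estimates the probability of a single differential for either difference operation, averaged over random long keys. It is not the exhaustive search described above: it samples plaintexts and keys instead of brute-forcing all differentials, and it uses hypothetical stand-ins (the PRESENT s-box and a nibble rotation, which is linear with respect to both sums) for the s-box of Figure 3 and the diffusion layer of Figure 1. All names and default parameters are assumptions.

```python
import random

# Stand-ins: the s-box of Figure 3 and the matrix of Figure 1 are not reproduced here.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SHIFTS = (0, 4, 8, 12)


def xor4(a, b):
    return a ^ b


def circ4(a, b):
    """Assumed 4-bit alternative sum for b = (0, 1); MSB = first coordinate."""
    a1, a2 = (a >> 3) & 1, (a >> 2) & 1
    b1, b2 = (b >> 3) & 1, (b >> 2) & 1
    return a ^ b ^ ((a1 & b2) ^ (a2 & b1))


def parallel(op, x, y):
    """Apply the 4-bit operation op to each nibble of two 16-bit blocks."""
    return sum(op((x >> s) & 0xF, (y >> s) & 0xF) << s for s in SHIFTS)


def sbox_layer(x):
    return sum(SBOX[(x >> s) & 0xF] << s for s in SHIFTS)


def diffusion(x):
    """Placeholder diffusion layer: rotate the four nibbles by one position."""
    return ((x << 4) | (x >> 12)) & 0xFFFF


def encrypt(x, round_keys):
    """Each round: s-box layer, diffusion layer, XOR with the round key."""
    for k in round_keys:
        x = diffusion(sbox_layer(x)) ^ k
    return x


def estimate_dp(op, d_in, d_out, rounds, n_keys=8, n_samples=512, seed=0):
    """Monte Carlo estimate of the expected probability of d_in -> d_out."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_keys):
        keys = [rng.randrange(1 << 16) for _ in range(rounds)]
        for _ in range(n_samples):
            x = rng.randrange(1 << 16)
            y = parallel(op, x, d_in)
            hits += parallel(op, encrypt(x, keys), encrypt(y, keys)) == d_out
    return hits / (n_keys * n_samples)
```

For example, `estimate_dp(circ4, 0x0070, 0x0600, rounds=2)` returns an estimate for one $\circ$-differential of the stand-in cipher. Probabilities as small as those reported above would need far more samples than these defaults, so the sketch only illustrates the shape of the experiment.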

Computational evidence shows that similar results, even with a faster diffusion, can be obtained by choosing the diffusion layer of the cipher as a random matrix of $H_\circ$. This suggests that, in principle, every matrix of $H_\circ$ could represent a trapdoor diffusion layer for the cipher, with respect to a differential distinguishing attack that exploits the knowledge of the operation $\circ$.

4 Open problems

In this article, we have demonstrated that when the diffusion layer of an SPN exhibits linearity not only with respect to the XOR operation, as traditionally expected, but also in relation to an alternative operation stemming from a different translation group, this particular characteristic can be leveraged by a cryptanalyst to carry out a distinguishing attack employing alternative differentials. However, it can be quite challenging to discern which maps meet this criterion. What we require is an extension of Theorem 2 to the case of parallel operations, enabling the simultaneous targeting of all the s-boxes within the cipher and taking advantage of the lower non-linearity of the confusion layer. One potential approach to address this issue might involve attempting to represent the linear layer in a manner akin to the blocks demonstrated in Theorem 2. Based on our empirical findings, we offer the following hypothesis:

Conjecture 1

Let $V = V_1 \oplus \cdots \oplus V_m$, with $V_i \cong (\mathbb{F}_2)^n$ for $i = 1, \ldots, m$, and let $\circ = (\circ_1, \ldots, \circ_m)$ be a parallel alternative operation as in Section 3, with $\dim(W_{\circ_i}) = n - 2$ for all $i$. Then, the cardinality of $H_\circ$ is at least

m 3 m ! 3 2 3 n 6 h = 0 n 4 ( 2 n 3 2 h ) [ ( m 2 m ) 2 n 2 5 n + 6 1 ] .

This illustrates that $H_\circ$ may possess a sufficient size to contain matrices that appear to function as effective diffusion layers but, in reality, conceal trapdoor vulnerabilities.

Another crucial concern is the elimination of the assumption $d = n - 2$, as this would enable us to consider a broader range of operations. Nevertheless, as of the present writing, we are unaware of the existence of a more comprehensive version of Theorem 2 that eliminates the condition $d = n - 2$. Consequently, the prospect of an extension to the parallel case remains unknown.

In conclusion, it is evident that the influence of this approach on differential probabilities is intrinsically linked to the cipher’s unique attributes. Computational evidence underscores the fact that even a minor modification in the design, such as altering the s-box or the diffusion layer, can have a profound influence on the resultant probabilities and outcomes. This heightened sensitivity to design specifics poses a challenge when attempting to establish general conjectures that can be universally applicable to different ciphers. In addition, it is important to note that these resulting probabilities are heavily contingent on the fixed alternative operations. Notably, within $\mathbb{F}_2^{16}$, approximately $2^{27}$ potential parallel alternative operations working over 4-bit blocks can be considered.

A final thought to consider is the observation that, as we have demonstrated, alternative operations have the potential to diminish the resistance of an s-box to differential cryptanalysis. This is further exemplified by the fact that a 4-bit permutation, which is considered optimal (according to the criteria in the study by Leander and Poschmann [21]), exhibits the highest possible differential uniformity, namely 16, when coupled with the operation defined in Figure 2. This raises an interesting open problem: the complete analysis of differential properties concerning alternative operations of various s-boxes, akin to what has been explored for the 4-bit permutations with respect to modular addition [22]. Even when focusing on small dimensions like 4-bit permutations, this undertaking requires some effort. It is important to note that, in this context, there are 106 possible operations available (as detailed in the study by Calderini et al. [15, Table 1]), including the XOR. Moreover, within the same affine-equivalence class of a given s-box, different functions may exhibit varying behaviour with respect to a fixed alternative operation.

We believe that the experimental results of this article show why the mentioned problems can be of interest in this area of research in cryptanalysis.

Acknowledgements

This work has been accepted for presentation at CIFRIS23, the Congress of the Italian association of cryptography “De Componendis Cifris.” M. Calderini and R. Civino are members of INdAM-GNSAGA (Italy).

  1. Funding information: R. Civino is funded by the Centre of Excellence ExEMERGE at the University of L’Aquila.

  2. Conflict of interest: The authors state that there is no conflict of interest.

References

[1] Biham E, Shamir A. Differential cryptanalysis of DES-like cryptosystems. J Cryptol. 1991;4:3–72. 10.1007/BF00630563

[2] Biham E, Biryukov A, Shamir A. Cryptanalysis of Skipjack reduced to 31 rounds using impossible differentials. J Cryptol. 2005;18:291–311. 10.1007/s00145-005-0129-3

[3] Knudsen LR. Truncated and higher order differentials. In: Fast Software Encryption: Second International Workshop Leuven, Belgium, December 14–16, 1994 Proceedings 2. Springer; 1995. p. 196–211. 10.1007/3-540-60590-8_16

[4] Wagner D. The boomerang attack. In: International Workshop on Fast Software Encryption. Springer; 1999. p. 156–70. 10.1007/3-540-48519-8_12

[5] Nyberg K. Differentially uniform mappings for cryptography. In: Workshop on the Theory and Application of Cryptographic Techniques. Springer; 1993. p. 55–64. 10.1007/3-540-48285-7_6

[6] Mesnager S, Mandal B, Msahli M. Survey on recent trends towards generalized differential and boomerang uniformities. Cryptogr Commun. 2022;14:691–735. 10.1007/s12095-021-00551-6

[7] Berson TA. Differential cryptanalysis mod 2^32 with applications to MD5. In: Advances in Cryptology–EUROCRYPT’92. EUROCRYPT 1992. Lecture Notes in Computer Science, vol. 658. Springer, Berlin, Heidelberg; 1993.

[8] Abazari F, Sadeghian B. Cryptanalysis with ternary difference: applied to block cipher PRESENT. Cryptology ePrint Archive. 2011. 10.7763/IJIEE.2012.V2.133

[9] Bogdanov A, Knudsen LR, Leander G, Paar C, Poschmann A, Robshaw MJ, et al. PRESENT: an ultra-lightweight block cipher. In: Cryptographic Hardware and Embedded Systems-CHES 2007: 9th International Workshop, Vienna, Austria, September 10–13, 2007. Proceedings 9. Springer; 2007. p. 450–66. 10.1007/978-3-540-74735-2_31

[10] Borisov N, Chew M, Johnson R, Wagner D. Multiplicative differentials. In: Fast Software Encryption: 9th International Workshop, FSE 2002 Leuven, Belgium, February 4–6, 2002 Revised Papers 9. Springer; 2002. p. 17–33. 10.1007/3-540-45661-9_2

[11] Lai X, Massey JL. A proposal for a new block encryption standard. In: Advances in Cryptology–EUROCRYPT’90: Workshop on the Theory and Application of Cryptographic Techniques Aarhus, Denmark, May 21–24, 1990 Proceedings 9. Springer; 1991. p. 389–404.

[12] Ellingsen P, Felke P, Riera C, Stănică P, Tkachenko A. C-differentials, multiplicative uniformity, and (almost) perfect c-nonlinearity. IEEE Trans Inform Theory. 2020;66(9):5781–9. 10.1109/TIT.2020.2971988

[13] Bartoli D, Kölsch L, Micheli G. Differential biases, c-differential uniformity, and their relation to differential attacks. 2022. arXiv: http://arXiv.org/abs/arXiv:220803884.

[14] Civino R, Blondeau C, Sala M. Differential attacks: using alternative operations. Designs Codes Cryptography. 2019;87:225–47. 10.1007/s10623-018-0516-z

[15] Calderini M, Civino R, Sala M. On properties of translation groups in the affine general linear group with applications to cryptography. J Algebra. 2021;569:658–80. 10.1016/j.jalgebra.2020.10.034

[16] Caranti A, Dalla Volta F, Sala M. On some block ciphers and imprimitive groups. Appl Algebra Eng Commun Comput. 2009;20(5-6):339–50. 10.1007/s00200-009-0100-x

[17] Brunetta C, Calderini M, Sala M. On hidden sums compatible with a given block cipher diffusion layer. Discrete Math. 2019;342(2):373–86. 10.1016/j.disc.2018.10.003

[18] Aragona R, Civino R, Gavioli N, Scoppola CM. Regular subgroups with large intersection. Annali di Matematica Pura ed Applicata. 2019;198(6):2043–57. 10.1007/s10231-019-00853-w

[19] Dixon JD. Maximal abelian subgroups of the symmetric groups. Canadian J Math. 1971;23(3):426–38. 10.4153/CJM-1971-045-7

[20] Bosma W, Cannon J, Playoust C. The Magma algebra system I: the user language. J Symbolic Comput. 1997;24(3–4):235–65. 10.1006/jsco.1996.0125

[21] Leander G, Poschmann A. On the classification of 4 bit S-boxes. In: Arithmetic of Finite Fields: First International Workshop, WAIFI 2007, Madrid, Spain, June 21–22, 2007. Proceedings 1. Springer; 2007. p. 159–76.

[22] Zajac P, Jókay M. Cryptographic properties of small bijective S-boxes with respect to modular addition. Cryptography Commun. 2020;12:947–63. 10.1007/s12095-020-00447-x

Received: 2023-09-06
Revised: 2023-10-15
Accepted: 2023-10-20
Published Online: 2024-02-14

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 18.11.2024 from https://www.degruyter.com/document/doi/10.1515/jmc-2023-0030/html