Data Structures and Algorithms - Chapter 9: Hashing


Chapter 9: Hashing
01 December 2008, Cao Hoang Tru, CSE Faculty - HCMUT

• Basic concepts
• Hash functions
• Collision resolution
• Open addressing
• Linked list resolution
• Bucket hashing

Basic Concepts

• Sequential search: O(n)
• Binary search: O(log₂ n)
Both require several key comparisons before the target is found.

• Search complexity:

Size         Binary    Sequential (Average)    Sequential (Worst Case)
16           4         8                       16
50           6         25                      50
256          8         128                     256
1,000        10        500                     1,000
10,000       14        5,000                   10,000
100,000      17        50,000                  100,000
1,000,000    20        500,000                 1,000,000

• Is there a search algorithm whose complexity is O(1)? YES.

keys → hashing → memory addresses
Each key has only one address.

Key                      Hash function    Address
102002 (Vu Nguyen)            →           005
107095 (John Adams)           →           100
111060 (Sarah Trapp)          →           002

Resulting list (address, record):
001    Harry Lee
002    Sarah Trapp
005    Vu Nguyen
007    Ray Black
100    John Adams

• Home address: address produced by a hash function.
• Prime area: memory that contains all the home addresses.
• Synonyms: a set of keys that hash to the same location.
• Collision: the location of the data to be inserted is already occupied by the synonym data.
• Ideal hashing:
  – No location collision
  – Compact address space
[...]
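The comparison counts in the search-complexity figures above can be reproduced with a few lines of Python. This is only a sketch: it assumes the binary-search column is log₂ n rounded up, the average sequential column is n / 2, and the worst-case sequential column is n. The function names are illustrative and do not come from the slides.

```python
def binary_comparisons(n: int) -> int:
    """Ceiling of log2(n) for n >= 1, computed with integer arithmetic."""
    return (n - 1).bit_length()


def sequential_average(n: int) -> int:
    """Average number of comparisons for a sequential search (assumed n / 2)."""
    return n // 2


def sequential_worst(n: int) -> int:
    """Worst-case number of comparisons for a sequential search."""
    return n


# Print one row per list size used in the slide's table.
for size in (16, 50, 256, 1_000, 10_000, 100_000, 1_000_000):
    print(size,
          binary_comparisons(size),
          sequential_average(size),
          sequential_worst(size))
```

Even for one million elements the binary search needs about 20 comparisons, which is the gap that motivates asking for an O(1) method.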
Basic Concepts
Search for B
hash(A) = 9
hash(B) = 9
hash(C) = 17
[Figure: list with addresses [1], [5], [9], [17] holding C, A, and B; B is located by probing]

Hash Functions
• Direct hashing
• Modulo division
• Digit extraction
• Mid-square
• Folding
• Rotation
• Pseudo-random

Direct Hashing
• The address...

64, 65, 66 → 26, 36, 46, 56, 66
Spreading the data more evenly across the address space

Pseudorandom
Key → Pseudorandom Number Generator → Random Number → Modulo Division → Address
y = ax + c
For maximum efficiency, a and c should be prime numbers

Pseudorandom
• Example: Key = 121267, a = 17, c = 7; Address...

elements

Collision Resolution
• As data are added and collisions are resolved, hashing tends to cause data to group within the list
⇒ Clustering: data are unevenly distributed across the list
• A high degree of clustering increases the number of probes needed to locate an element
⇒ Minimize clustering

Collision...

Digit Extraction
Address = selected digits from Key
• Example:
379452 → 394
121267 → 112
378845 → 388
160252 → 102
045128 → 051

Mid-square
Address = middle digits of Key²
• Example: 9452 * 9452 = 89340304 → 3403

Mid-square
• Disadvantage: the...
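The digit-extraction, mid-square, and pseudorandom examples above can be checked with a short Python sketch. Several details are assumptions rather than statements from the slides: the extracted digit positions (first, third, fourth) are inferred from the five sample keys, the mid-square width of four digits is inferred from the single example, and the list size 307 is taken from the fragment "2061546 MOD 307 + 1" that appears elsewhere in this preview. All function names are hypothetical.

```python
def digit_extraction(key: str, positions=(1, 3, 4)) -> str:
    """Select digits at the given 1-based positions.

    Keys are strings so that a leading zero (e.g. "045128") is preserved.
    The default positions are an assumption inferred from the examples.
    """
    return "".join(key[p - 1] for p in positions)


def mid_square(key: int, width: int = 4) -> str:
    """Square the key and return the middle `width` digits of the result."""
    squared = str(key * key)
    start = (len(squared) - width) // 2
    return squared[start:start + width]


def pseudorandom(key: int, a: int = 17, c: int = 7, list_size: int = 307) -> int:
    """Compute y = a*key + c, then map it to 1..list_size by modulo division.

    list_size = 307 is assumed from a fragment of the same example.
    """
    return (a * key + c) % list_size + 1


print(digit_extraction("379452"))   # 394
print(mid_square(9452))             # 3403
print(pseudorandom(121267))         # 42
```

Returning the digit-based addresses as strings is a design choice of this sketch; it keeps addresses such as 051 intact instead of collapsing them to 51.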
hash(C) = 17
[Figure: list with addresses [1], [5], [9], [17]; A is stored]

Basic Concepts
Insert A, B, C
hash(A) = 9
hash(B) = 9     B and A collide at 9
hash(C) = 17
[Figure: list with addresses [1], [5], [9], [17]; A and B are stored after collision resolution]

Basic Concepts
Insert A, B, C
hash(A) = 9
hash(B) = 9     B and A collide at 9
hash(C) = 17    C and B collide at 17
[Figure: list with addresses [1], [5], [9], [17]; A, B, and C are stored; label: Collision...]

= 2061546 MOD 307 + 1 = 41 + 1 = 42

Collision Resolution
• Except for direct hashing, none of the hash functions is a one-to-one mapping
⇒ Collision resolution methods are required
• Each collision resolution method can be used independently with each hash function

Collision Resolution
• A rule of thumb:...

Direct Hashing
• Advantage: there is no collision
• Disadvantage: the address space (storage size) is as large as the key space

Modulo Division
Address = Key MOD listSize + 1
• Fewer collisions if listSize is a prime number
• Example: Numbering system to handle 1,000,000 employees
Data space to store...

resolution
• Bucket hashing

Open Addressing
• When a collision occurs, an unoccupied element is searched for in which to place the new element

Open Addressing
• Hash function: h: U → {0, …, m − 1}
  U: set of keys
  {0, …, m − 1}: addresses

Open Addressing
• Hash and probe function:...
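As an illustration of open addressing, here is a minimal Python sketch. The probe function is cut off in the text above, so this sketch substitutes linear probing (try the next address, wrapping around) purely as an assumption; it is not necessarily the method the slides go on to define. The home address uses modulo division onto {0, …, m − 1}, and the class and method names are hypothetical.

```python
from typing import List, Optional


class LinearProbingTable:
    """Open-addressing hash table for integer keys (illustrative only)."""

    def __init__(self, m: int) -> None:
        self.m = m
        self.slots: List[Optional[int]] = [None] * m

    def home(self, key: int) -> int:
        """Home address h(key) in {0, ..., m - 1} by modulo division."""
        return key % self.m

    def insert(self, key: int) -> int:
        """Store key in the first unoccupied slot on its probe path."""
        for i in range(self.m):
            addr = (self.home(key) + i) % self.m  # assumed linear probe
            if self.slots[addr] is None or self.slots[addr] == key:
                self.slots[addr] = key
                return addr
        raise OverflowError("hash table is full")

    def search(self, key: int) -> Optional[int]:
        """Return the address of key, or None if it is not stored."""
        for i in range(self.m):
            addr = (self.home(key) + i) % self.m
            if self.slots[addr] is None:   # empty slot ends the probe path
                return None
            if self.slots[addr] == key:
                return addr
        return None


table = LinearProbingTable(7)
print(table.insert(10))   # home address 3
print(table.insert(17))   # collides at 3, placed at 4
print(table.search(17))   # found at 4 after one extra probe
```

The sketch deliberately omits deletion: removing a key by simply emptying its slot would break the probe path for synonyms stored after it.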
... Resolution
• Primary clustering: data become clustered around a home address
Insert A9, B9, C9, D11, E12
[Figure: list with addresses [1], [9], [10], [11], [12], [13] holding A, B, C, D, E]

Collision Resolution
• Secondary clustering: data become grouped along a collision path throughout a list
Insert A9, B9, C9, D11, E12, F9
[Figure: list with addresses [1], [9], [10], [11], [12], [13] holding A, B, C, D, E]...

⇒ 368
321 + 456 + 987 = 1764 ⇒ 764

Rotation
• Hashing keys that are identical except for the last character may create synonyms
• The key is rotated before hashing

original key    rotated key
600101          160010
600102          260010
600103          360010
600104          460010
600105          560010

Rotation
• Used in combination with ...

Date posted: 06/03/2014, 17:20
