Hacking ebook nostarch sampler learnyousomecode


Document information

Posted: 05/11/2019, 21:33

More From No Starch Press

Read sample chapters from these No Starch books!

  • Impractical Python Projects, Lee Vaughan
  • Serious Python, Julien Danjou
  • Math Adventures with Python, Peter Farrell
  • Practical SQL, Anthony DeBarros
  • The Rust Programming Language, Steve Klabnik and Carol Nichols, with contributions from the Rust community
  • Practical Binary Analysis, Dennis Andriesse
  • Mission Python, Sean McManus
  • Coding with Minecraft, Al Sweigart

Breeding Giant Rats with Genetic Algorithms

Genetic algorithms are general-purpose optimization programs designed to solve complex problems. Invented in the 1970s, they belong to the class of evolutionary algorithms, so named because they mimic the Darwinian process of natural selection. They are especially useful when little is known about a problem, when you’re dealing with a nonlinear problem, or when searching for brute-force-type solutions in a large search space. Best of all, they are easy algorithms to grasp and implement.

In this chapter, you’ll use genetic algorithms to breed a race of super-rats that can terrorize the world. After that, you’ll switch sides and help James Bond crack a high-tech safe in a matter of seconds. These two projects should give you a good appreciation for the mechanics and power of genetic algorithms.

Finding the Best of All Possible Solutions

Genetic algorithms optimize, which means that they select the best solution (with regard to some criteria) from a set of available alternatives. For example, if you’re looking for the fastest route to drive from New York to Los Angeles, a genetic algorithm will never suggest you fly. It can choose only from within an allowed set of conditions that you provide. As optimizers, these algorithms are faster than traditional methods and can avoid premature convergence to a suboptimal answer. In other words, they efficiently search the solution space yet do so thoroughly enough to
avoid picking a good answer when a better one is available.

Unlike exhaustive search engines, which use pure brute force, genetic algorithms don’t try every possible solution. Instead, they continuously grade solutions and then use them to make “informed guesses” going forward. A simple example is the “warmer-colder” game, where you search for a hidden item as someone tells you whether you are getting warmer or colder based on your proximity or search direction. Genetic algorithms use a fitness function, analogous to natural selection, to discard “colder” solutions and build on the “warmer” ones. The basic process is as follows:

1. Randomly generate a population of solutions.
2. Measure the fitness of each solution.
3. Select the best (warmest) solutions and discard the rest.
4. Cross over (recombine) elements in the best solutions to make new solutions.
5. Mutate a small number of elements in the solutions by changing their value.
6. Return to step 2 and repeat.

The select–cross over–mutate loop continues until it reaches a stop condition, like finding a known answer, finding a “good enough” answer (based on a minimum threshold), completing a set number of iterations, or reaching a time deadline. Because these steps closely resemble the process of evolution, complete with survival of the fittest, the terminology used with genetic algorithms is often more biological than computational.

Project #13: Breeding an Army of Super-Rats

Here’s your chance to be a mad scientist with a secret lab full of boiling beakers, bubbling test tubes, and machines that go “BZZZTTT.” So pull on some black rubber gloves and get busy turning 100 nimble trash-eating scavengers into massive man-eating monsters.

The Objective

Use a genetic algorithm to simulate breeding rats to an average weight of 110 pounds.

Strategy

Your dream is to breed a race of rats the size of bullmastiffs (we’ve already established that you’re mad). You’ll start with Rattus norvegicus, the brown rat, then add some artificial
sweeteners, some atomic radiation from the 1950s, a lot of patience, and a pinch of Python, but no genetic engineering; you’re old-school, baby! The rats will grow from less than a pound to a terrifying 110 pounds, about the size of a female bullmastiff (see Figure 7-1).

Figure 7-1: Size comparison of a brown rat, a female bullmastiff, and a human

Before you embark on such a huge undertaking, it’s prudent to simulate the results in Python. And you’ve drawn up something better than a plan: you’ve drawn some graphical pseudocode (see Figure 7-2).

Figure 7-2: Genetic algorithm approach to breeding super-rats. The figure shows one setup step followed by a loop of four steps:

  • Populate: Establish initial population and range of weights (minimum, mode, maximum).
  • Grade: Evaluate fitness by comparing mean population weight to a target weight.
  • Select: Cull smallest males and females.
  • Breed: Repopulate with random weights based on weight range (minimum to maximum) of the selected rats.
  • Mutate: Randomly alter weights on a few rats. Most outcomes reduce weight.

The process shown in Figure 7-2 outlines how a genetic algorithm works. Your goal is to produce a population of rats with an average weight of 110 pounds from an initial population weighing much less than that. Going forward, each population (or generation) of rats represents a candidate solution to the problem. Like any animal breeder, you cull undesirable males and females, which you humanely send to (for you Austin Powers fans) an evil petting zoo. You then mate and breed the remaining rats, a process known as crossover in genetic programming.

The offspring of the remaining rats will be essentially the same size as their parents, so you need to mutate a few. While mutation is rare and usually results in a neutral-to-nonbeneficial trait (low weight, in this case), sometimes you’ll successfully produce a bigger rat.

The whole process then becomes a big repeating loop, whether done organically or programmatically, making me wonder whether we really are just
virtual beings in an alien simulation. At any rate, the end of the loop (the stop condition) is when the rats reach the desired size or you just can’t stand dealing with rats anymore.

For input to your simulation, you’ll need some statistics. Use the metric system since you’re a scientist, mad or not. You already know that the average weight of a female bullmastiff is 50,000 grams, and you can find useful rat statistics in Table 7-1.

Table 7-1: Brown Rat Weight and Breeding Statistics

Parameter                      Published values
Minimum weight                 200 grams
Average weight (female)        250 grams
Average weight (male)          300–350 grams
Maximum weight                 600 grams*
Number of pups per litter      8–12
Litters per year               4–13
Life span (wild, captivity)    1–3 years, 4–6 years

*Exceptional individuals may reach 1,000 grams in captivity.

Because both domestic and wild brown rats exist, there may be wide variation in some of the stats. Rats in captivity tend to be better cared for than wild rats, so they weigh more, breed more, and have more pups. So you can choose from the higher end when a range is available. For this project, start with the assumptions in Table 7-2.

Table 7-2: Input Assumptions for the Super-Rats Genetic Algorithm

Variable and value        Comments
GOAL = 50000              Target weight in grams (female bullmastiff)
NUM_RATS = 20             Total number of adult rats your lab can support
INITIAL_MIN_WT = 200      Minimum weight of adult rat, in grams, in initial population
INITIAL_MAX_WT = 600      Maximum weight of adult rat, in grams, in initial population
INITIAL_MODE_WT = 300     Most common adult rat weight, in grams, in initial population
MUTATE_ODDS = 0.01        Probability of a mutation occurring in a rat
MUTATE_MIN = 0.5          Scalar on rat weight of least beneficial mutation
MUTATE_MAX = 1.2          Scalar on rat weight of most beneficial mutation
LITTER_SIZE =             Number of pups per pair of mating rats
LITTERS_PER_YEAR = 10     Number of litters per year per pair of mating rats
GENERATION_LIMIT = 500    Generational cutoff to stop breeding program
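The populate, grade, select, breed, mutate loop described for the super-rats project can be sketched in a few lines of Python. This is a simplified illustration, not the book's program: it ignores the male/female split and LITTERS_PER_YEAR, keeps the surviving parents alongside their pups, and assumes LITTER_SIZE = 8 (the low end of the range in Table 7-1) because the value is missing from Table 7-2 as reproduced here. All function names are hypothetical.

```python
import random
import statistics

# Constants from Table 7-2. LITTER_SIZE is an assumption (value missing above).
GOAL = 50000
NUM_RATS = 20
INITIAL_MIN_WT = 200
INITIAL_MAX_WT = 600
INITIAL_MODE_WT = 300
MUTATE_ODDS = 0.01
MUTATE_MIN = 0.5
MUTATE_MAX = 1.2
LITTER_SIZE = 8          # assumed
GENERATION_LIMIT = 500


def populate(rng):
    """Populate: draw initial weights from a triangular distribution."""
    return [int(rng.triangular(INITIAL_MIN_WT, INITIAL_MAX_WT, INITIAL_MODE_WT))
            for _ in range(NUM_RATS)]


def fitness(population):
    """Grade: ratio of mean weight to the goal (1.0 or more means done)."""
    return statistics.mean(population) / GOAL


def select(population):
    """Select: keep only the NUM_RATS heaviest rats."""
    return sorted(population)[-NUM_RATS:]


def breed(rng, parents):
    """Breed: pups get random weights within the parents' weight range."""
    low, high = min(parents), max(parents)
    pairs = len(parents) // 2
    return [rng.randint(low, high) for _ in range(pairs * LITTER_SIZE)]


def mutate(rng, children):
    """Mutate: rarely rescale a pup's weight; most outcomes reduce it."""
    return [round(w * rng.uniform(MUTATE_MIN, MUTATE_MAX))
            if rng.random() <= MUTATE_ODDS else w
            for w in children]


def run_simulation(seed=None):
    """Return the mean population weight recorded at each generation."""
    rng = random.Random(seed)
    population = populate(rng)
    history = [statistics.mean(population)]
    # Stop condition: goal reached or generation limit hit.
    while fitness(population) < 1 and len(history) <= GENERATION_LIMIT:
        survivors = select(population)
        children = mutate(rng, breed(rng, survivors))
        population = survivors + children
        history.append(statistics.mean(population))
    return history


if __name__ == "__main__":
    weights = run_simulation(seed=1)
    print(f"generations: {len(weights) - 1}, final mean: {weights[-1]:.0f} g")
```

Passing a seed makes a run repeatable, which helps when comparing the effect of changing one assumption, such as MUTATE_ODDS, at a time. Whether the goal is reached before GENERATION_LIMIT depends on the random draws.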
10 PRINCIPLES OF DYNAMIC TAINT ANALYSIS

Imagine that you’re a hydrologist who wants to trace the flow of a river that runs partly underground. You already know where the river goes underground, but you want to find out whether and where it emerges. One way to solve this problem is to color the river’s water using a special dye and then look for locations where the colored water reappears. The topic of this chapter, dynamic taint analysis (DTA), applies the same idea to binary programs. Similar to coloring and tracing the flow of water, you can use DTA to color, or taint, selected data in a program’s memory and then dynamically track the data flow of the tainted bytes to see which program locations they affect.

In this chapter, you’ll learn the principles of dynamic taint analysis. DTA is a complex technique, so it’s important to be familiar with its inner workings to build effective DTA tools. In Chapter 11, I’ll introduce you to libdft, an open source DTA library, which we’ll use to build several practical DTA tools.

10.1 What Is DTA?
Dynamic taint analysis (DTA), also called data flow tracking (DFT), taint tracking, or simply taint analysis, is a program analysis technique that allows you to determine the influence that selected program state has on other parts of the program state. For instance, you can taint any data that a program receives from the network, track that data, and raise an alert if it affects the program counter, which can indicate a control-flow hijacking attack.

In the context of binary analysis, DTA is typically implemented on top of a dynamic binary instrumentation platform such as Pin, which we discussed in an earlier chapter. To track the flow of data, DTA instruments all instructions that handle data, either in registers or in memory. In practice, this includes nearly all instructions, which means that DTA leads to very high performance overhead on instrumented programs. Slowdowns of 10× or more are not uncommon, even in optimized DTA implementations. While a 10× overhead may be acceptable during security tests of a web server, for instance, it usually isn’t okay in production. This is why you’ll typically use DTA only for offline analysis of programs.

You can also base taint analysis systems on static instrumentation instead of dynamic instrumentation, inserting the necessary taint analysis logic at compile time rather than at runtime. While that approach usually results in better performance, it also requires source code. Since our focus is binary analysis, we’ll stick to dynamic taint analysis in this book.

As mentioned, DTA allows you to track the influence of selected program state on interesting program locations. Let’s take a closer look at the details of what this means: how do you define interesting state or locations, and what exactly does it mean for one part of the state to “influence” another?
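To make the idea of "influence" concrete, here is a minimal Python sketch that attaches a taint flag to each value and carries it through addition. It is a toy model at the level of whole values, not how a binary-level DTA system works internally, and the names Value and jump_to are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A number plus a flag recording whether tainted data influenced it."""
    number: int
    tainted: bool = False

    def __add__(self, other: "Value") -> "Value":
        # The result is influenced by both operands, so taint is OR-ed.
        return Value(self.number + other.number,
                     self.tainted or other.tainted)


def jump_to(target: Value) -> str:
    """Stand-in for updating the program counter; alerts on tainted targets."""
    if target.tainted:
        return f"ALERT: tainted jump target {target.number:#x}"
    return f"jump to {target.number:#x}"


if __name__ == "__main__":
    from_network = Value(0x41, tainted=True)   # data received from the network
    table_base = Value(0x400000)               # trusted program data
    print(jump_to(table_base))
    print(jump_to(table_base + from_network))
```

The second call raises the alert because the jump target was computed from network data, which is the control-flow hijacking signal described above.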
10.2 DTA in Three Steps: Taint Sources, Taint Sinks, and Taint Propagation

At a high level, taint analysis involves three steps: defining taint sources, defining taint sinks, and tracking taint propagation. If you’re developing a tool based on DTA, the first two steps (defining taint sources and sinks) are up to you. The third step (tracking the taint propagation) is usually handled by an existing DTA library, such as libdft, but most DTA libraries also provide ways for you to customize this step if you want. Let’s go over these three steps and what each entails.

10.2.1 Defining Taint Sources

Taint sources are the program locations where you select the data that’s interesting to track. For example, system calls, function entry points, or individual instructions can all be taint sources, as you’ll see shortly. What data you choose to track depends on what you want to achieve with your DTA tool.

You can mark data as interesting by tainting it using API calls provided for that very purpose by the DTA library you’re using. Typically, those API calls take a register or memory address to mark as tainted as the input. For example, let’s say you want to track any data that comes in from the network to see whether it exhibits any behavior that could indicate an attack. To do that, you instrument network-related system calls like recv or recvfrom with a callback function that’s called by the dynamic instrumentation platform whenever these system calls occur. In that callback function, you loop over all the received bytes and mark them as tainted. In this example, the recv and recvfrom functions are your taint sources.

Similarly, if you’re interested in tracking data read from a file, then you’d use system calls such as read as your taint source. If you want to track numbers that are the product of two other numbers, you could taint the output operands of multiplication instructions, which are then your taint sources, and so on.

10.2.2 Defining Taint Sinks

Taint sinks are the
program locations you check to see whether they can be influenced by tainted data. For example, to detect control-flow hijacking attacks, you’d instrument indirect calls, indirect jumps, and return instructions with callbacks that check whether the targets of these instructions are influenced by tainted data. These instrumented instructions would be your taint sinks. DTA libraries provide functions that you can use to check whether a register or memory location is tainted. Typically, when taint is detected at a taint sink, you’ll want to trigger some response, such as raising an alert.

10.2.3 Tracking Taint Propagation

As I mentioned, to track the flow of tainted data through a program, you need to instrument all instructions that handle data. The instrumentation code determines how taint propagates from the input operands of an instruction to its output operands. For instance, if the input operand of a mov instruction is tainted, the instrumentation code will mark the output operand as tainted as well, since it’s clearly influenced by the input operand. In this way, tainted data may eventually propagate all the way from a taint source to a taint sink.

Tracking taint is a complicated process because determining which parts of an output operand to taint isn’t always trivial. Taint propagation is subject to a taint policy that specifies the taint relationship between input and output operands. As I’ll explain in Section 10.4, there are different taint policies you can use depending on your needs. To save you the trouble of having to write instrumentation code for all instructions, taint propagation is typically handled by a dedicated DTA library, such as libdft.

Now that you understand how taint tracking works in general, let’s explore how you can use DTA to detect an information leak using a concrete example. In Chapter 11, you’ll learn how to implement your own tool to detect just this kind of vulnerability!
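The three steps above (source, propagation, sink) can be modeled with a byte-granular "shadow" set of tainted addresses. The Python sketch below is a toy under stated assumptions: memory is a dictionary, recv and mov are ordinary methods rather than instrumented system calls and instructions, and none of the names correspond to the libdft API.

```python
class TaintTracker:
    """Toy byte-granular taint tracker; all names here are hypothetical."""

    def __init__(self):
        self.memory = {}       # address -> byte value
        self.tainted = set()   # shadow state: addresses holding tainted bytes

    def write_clean(self, addr, data):
        """Store trusted bytes, clearing any stale taint at those addresses."""
        for offset, byte in enumerate(data):
            self.memory[addr + offset] = byte
            self.tainted.discard(addr + offset)

    def recv(self, addr, data):
        """Taint source: every byte received from the network is tainted."""
        for offset, byte in enumerate(data):
            self.memory[addr + offset] = byte
            self.tainted.add(addr + offset)

    def mov(self, dst, src, size):
        """Propagation: copy bytes and carry each byte's taint along."""
        # Snapshot the source first so overlapping ranges behave sensibly.
        copied = [(self.memory.get(src + i, 0), (src + i) in self.tainted)
                  for i in range(size)]
        for i, (byte, is_tainted) in enumerate(copied):
            self.memory[dst + i] = byte
            if is_tainted:
                self.tainted.add(dst + i)
            else:
                self.tainted.discard(dst + i)

    def check_indirect_call(self, ptr_addr, size=8):
        """Taint sink: True if any byte of the call target is tainted."""
        return any((ptr_addr + i) in self.tainted for i in range(size))


if __name__ == "__main__":
    tracker = TaintTracker()
    tracker.write_clean(0x2000, (0x400123).to_bytes(8, "little"))
    tracker.recv(0x1000, b"A" * 8)
    print(tracker.check_indirect_call(0x2000))   # pointer still trusted
    tracker.mov(0x2000, 0x1000, 8)               # attacker data overwrites it
    print(tracker.check_indirect_call(0x2000))
```

The policy in mov is the simplest one possible: an output byte is tainted exactly when the corresponding input byte is. Instructions that combine operands need a richer policy, which is the complication the text attributes to taint propagation.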
10.3 Using DTA to Detect the Heartbleed Bug

To see how DTA can be useful in practice, let’s consider how you can use it to detect the Heartbleed vulnerability in OpenSSL. OpenSSL is a cryptographic library that’s widely used to protect communications on the Internet, including connections to websites and email servers. Heartbleed can be abused to leak information from systems using a vulnerable version of OpenSSL. This can include highly sensitive information, such as private keys and usernames/passwords stored in memory.

10.3.1 A Brief Overview of the Heartbleed Vulnerability

Heartbleed abuses a classic buffer overread in OpenSSL’s implementation of the Heartbeat protocol (note that Heartbeat is the name of the exploited protocol, while Heartbleed is the name of the exploit). The Heartbeat protocol allows devices to check whether the connection with an SSL-enabled server is still alive by sending the server a Heartbeat request containing an arbitrary character string specified by the sender. If all is well, the server responds by echoing back that string in a Heartbeat response message.

In addition to the character string, the Heartbeat request contains a field specifying the length of that string. It’s the incorrect handling of this length field that results in the Heartbleed vulnerability. Vulnerable versions of OpenSSL allow an attacker to specify a length that’s much longer than the actual string, causing the server to leak additional bytes from memory when copying the string into the response.

Listing 10-1 shows the OpenSSL code responsible for the Heartbleed bug. Let’s briefly discuss how it works and then go over how DTA can detect Heartbleed-related information leaks.

Listing 10-1: The code that caused the OpenSSL Heartbleed vulnerability

    /* Allocate memory for the response, size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;
    /* Enter response type, length and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);
    bp += payload;
    /* Random padding */
    RAND_pseudo_bytes(bp, padding);

    r = ssl3_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);

Lists Can Save Your Life

Astronauts live by lists. The safety checklists they use help make sure all systems are working before they entrust their lives to those systems. For example, emergency checklists tell the astronauts what to do in dire situations to prevent them from panicking. Procedural checklists confirm that they’re using their equipment correctly so nothing breaks and prevents them from returning home. These lists just might save their lives one day.

In this chapter, you’ll learn how to manage lists in Python and how to use them for checklists, maps, and almost anything in the universe. When you build the Escape game, you’ll use lists to store information about the space station layout.

Making Your First List: The Take-Off Checklist

Take-off is one of the most dangerous aspects of space travel. When you’re strapped to a rocket, you want to double-check everything before it launches. A simple checklist for take-off might contain the following steps:

  • Put on suit
  • Seal hatch
  • Check cabin pressure
  • Fasten seatbelt

Python has the perfect way to store this information: the Python list is like a variable that stores multiple items. As you’ll see, you can use it for numbers and text as well as a combination of both.

Let’s make a list in Python called take_off_checklist for our astronauts to use. Because we’re just practicing with a short example, we’ll enter the code in the Python shell rather than creating a program. (If you need a refresher on how to use the Python shell, see “Introducing the Python Shell” on page 15.)
Enter the following in the IDLE shell, pressing ENTER at the end of each line to start a new line in the list:

    >>> take_off_checklist = ["Put on suit",
                              "Seal hatch",
                              "Check cabin pressure",
                              "Fasten seatbelt"]

Red Alert

Make sure the brackets, quote marks, and commas in your code are precise. If you get any errors, enter the list code again, and double-check that the brackets, quotes, and commas are in the correct places. To avoid having to retype the code, use your mouse to highlight the text in the shell, right-click the text, select Copy, right-click again, and select Paste.

Let’s take a closer look at how the take_off_checklist list is made. You mark the start of the list with an opening square bracket. Python knows the list is not finished until it detects the final closing square bracket. This means you can press ENTER at the end of each line to continue typing the instruction, and Python will know you’re not finished until you’ve given it the final bracket.

Quote marks tell Python that you’re giving it some text and where each piece of text starts and ends. Each entry needs its own opening and closing quote marks. You also need to separate the different pieces of text with commas. The last entry doesn’t need a comma after it, because there isn’t another list item following it.

Seeing Your List

To see your checklist, you can use the print() function, as we did in Chapter 1. Add the name of your list to the print() function, like this:

    >>> print(take_off_checklist)
    ['Put on suit', 'Seal hatch', 'Check cabin pressure', 'Fasten seatbelt']

You don’t need quotes around take_off_checklist, because it’s the name of a list, not a piece of text. If you put quotes around it, Python will just write the text take_off_checklist onscreen instead of giving you back your list. Try it to see what happens.

Adding and Removing Items

Even after you’ve created a list, you can add an item to it using the append() command. The word append means to add something at the end (think of an
appendix, at the end of a book). You use the append() command like this:

    >>> take_off_checklist.append("Tell Mission Control checks are complete")

You enter the name of the list (without quote marks) followed by a period and the append() command, and then put the item to add in parentheses. The item will be added to the end of the list, as you’ll see when you print the list again:

    >>> print(take_off_checklist)
    ['Put on suit', 'Seal hatch', 'Check cabin pressure', 'Fasten seatbelt',
    'Tell Mission Control checks are complete']

You can also take items out of the list using the remove() command. Let’s remove the Seal hatch item:

    >>> take_off_checklist.remove("Seal hatch")
    >>> print(take_off_checklist)
    ['Put on suit', 'Check cabin pressure', 'Fasten seatbelt', 'Tell Mission
    Control checks are complete']

Again, you enter the name of the list followed by a period and the remove() command, and then specify the item you want to remove inside the parentheses.

Red Alert

When you’re removing an item from a list, make sure what you type matches the item exactly, including capital letters and any punctuation. Otherwise, Python won’t recognize it and will give you an error.

Programming a Robot Lumberjack

It’s time to put your new programming knowledge to the test in the world of Minecraft. We’ll program our first turtle to chop down all the wood blocks of a tree. With the help of these turtles, your wood supply problems will be over!
Chopping trees by hand in Minecraft has many problems. It’s slow, it wears out your tools, and you need to reach the topmost wood block to completely chop down a tree. In comparison, turtles can harvest a wood block in one chop, their tools don’t wear out, and they can hover as high as you need them to, as shown in Figure 6-1.

Figure 6-1: Four turtles chopping a tall jungle tree

Before we can write our tree-chopping program, you need to learn some additional turtle functions and you need to think about how the program will work.

Equipping Turtles with Tools

To chop down trees, you need to equip the turtle with a brand-new diamond tool. You can equip turtles with diamond pickaxes, shovels, axes, hoes, or swords, but an iron tool or a used diamond tool won’t work. Fortunately, a tool’s durability will never decrease once a turtle is equipped with it.

To equip a turtle with a tool, place the tool in the turtle’s currently selected inventory slot, or current slot. This is the inventory slot with the thick border around it. Craft a diamond pickaxe and place it in the turtle’s current slot. Run the Lua shell by entering the following:

    > lua
    Interactive Lua prompt.
    Call exit() to exit.

Then, equip your turtle with the selected item by running this command:

    lua> turtle.equipLeft()

Turtles can equip up to two tools: one on their left side and the other on their right. If you want to unequip a turtle, just call the turtle.equipLeft() or turtle.equipRight() function with nothing in the currently selected slot. The turtle will remove the tool and put it in its inventory.

Turtles can equip any diamond tool, but the diamond pickaxe is the most versatile. The diamond shovel can mine dirt blocks and the diamond axe can mine wood blocks, but neither can mine stone or ore blocks. The diamond pickaxe can mine all types of blocks, so we’ll use it for all the turtles in this book. With the pickaxe equipped, the turtle can call the turtle.dig() function, which I’ll explain in the next section, to mine
blocks or chop wood.

Designing a Tree-Chopping Algorithm

Before we write code, let’s thoroughly think through what the lumberjack turtle needs to do. By planning ahead of time, you’ll spot mistakes in your program early instead of discovering them only after you’ve written it. As the old carpenter saying goes, “measure twice; cut once.”

We will be planning the turtle’s tree-chopping algorithm. An algorithm is a series of steps for a computer to follow to solve a problem. To chop down a tree, we’ll start the turtle at the base, dig, move forward, dig above the turtle, move up, and then repeat the last two steps for the whole tree. When the turtle is done, it will move back to the ground so it can be picked up. Figures 6-2 to 6-6 show this entire process.

Figure 6-2: The turtle starts at the bottom of the tree, facing the bottom wood block.
Figure 6-3: The turtle chops the bottom wood block, and then moves forward so it is under the tree.
Figure 6-4: The turtle chops upward, and then moves up one space.
Figure 6-5: The turtle keeps chopping up until there are no more wood blocks above it.
Figure 6-6: The turtle moves back down to the ground so the player can pick it up. The leaves will decompose.

Founded in 1994, No Starch Press is one of the few remaining independent technical book publishers. We publish the finest in geek entertainment: unique books on technology, with a focus on open source, security, hacking, programming, alternative operating systems, and LEGO. Our titles have personality, our authors are passionate, and our books tackle topics that people care about.

VISIT WWW.NOSTARCH.COM FOR A COMPLETE CATALOG.

No Starch Press 2018 Catalog for Humble Book Bundle: Learn You Some Code

Copyright © 2018 No Starch Press, Inc. All rights reserved.

Impractical Python Projects © Lee Vaughan
Serious Python © Julien Danjou
Math Adventures with Python © Peter Farrell
Practical SQL © Anthony DeBarros
The Rust Programming Language © Mozilla Corporation and the Rust Project Developers
Practical Binary Analysis © Dennis Andriesse
Mission Python © Sean McManus
Coding with Minecraft © Al Sweigart

No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of No Starch Press, Inc.

... with all the examples in this book, is available for download via the resources at https://www.nostarch.com/practicalSQL/

    CREATE DATABASE analysis;

Listing 1-1: Creating a database named analysis