‘By definition all scientists are data scientists. In my opinion, they are half hacker, half analyst, they use data to build products and find insights.’ Monica Rogati, VP for Data, Jawbone
Today’s post is part of a new effort to bring back some of the most successful content from turtlshel.org’s long history. These posts will cover the full range of past TS topics, including Japanese language learning resources, geography, Esperanto, international futbol, NERD stuff, bread baking, fly tying, and data science stuff. The original article was published on turtlshel.org in 2005.
The area of a surface with square corners and straight edges on a map or photo can be found by multiplying the length of the surface by its width and then converting to real-world units using the scale of the image. What do you do if you need to find the area of an organic shape, or of a very complex geometric shape? One method that is widely used is the dot count using a dot grid.
A dot grid is a transparent sheet printed or drawn with dots arranged in a regular and even pattern such as a grid. You first calibrate the dot grid to the scale of the map or photo you are studying by finding the number of dots that fall in a known area. The area of an unknown surface can then be found by laying the dot grid over the surface, counting the number of dots that fall in and on it, and dividing that number by the calibrated number of dots per unit area. This gives you the area of the surface you are estimating in the units of your calibration.
As an example, let us suppose that I have calibrated my dot grid on an aerial photograph using a farmer’s field, bounded by section-line roads, as my known area. How do I know the area of the farmer’s field? Often, fields are laid out along the US Public Land Survey System, with roads following the 1-mile edges of sections. A field bounded by such roads would be 1 square mile. These human features are obvious in aerial photos and on maps, and are very useful for establishing scale and for calibrating dot and square grids. Calibrating against the 1-square-mile field, suppose I find that my dot grid has 225 dots per square mile.
Now suppose that I have to find the surface area of a lake on the same photograph as the 1-square-mile field. Using the dot grid that I have just calibrated, I cover the lake with the dot grid and count 675 dots on the lake. The number of dots in my area-for-estimation (675), divided by my number of dots per unit area (225 dots per square mile), gives the lake an area of 3 square miles.
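If you would rather check the arithmetic in code, here is a minimal Python sketch of the calibrate-then-estimate steps, using the numbers from the example. The function names are hypothetical and only there for illustration.

```python
def dots_per_unit_area(dots_in_known_area, known_area):
    """Calibration: how many dots fall in one unit of real-world area."""
    return dots_in_known_area / known_area


def estimate_area(dots_on_shape, dot_density):
    """Estimation: divide the dot count by the calibrated dot density."""
    return dots_on_shape / dot_density


# Calibrate against the 1-square-mile field (225 dots counted on it).
density = dots_per_unit_area(225, 1.0)

# Estimate the lake (675 dots counted on it), in square miles.
lake_area = estimate_area(675, density)
print(lake_area)  # 3.0
```

The result comes out in whatever unit the calibration used, so calibrating on a field measured in acres or hectares would give the lake's area in those units instead.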
There are a few key points to using the dot grid:
- A dot count is a statistical method. It is important that you don’t line up the grid to get the best fit over your object. The whole point of the dot count is to see how many dots fall within the area when the dot grid is placed in a random relationship to it.
- When you are counting dots, each dot that falls completely within the area is given the weight of 1 full dot. Any dot that touches the side of the object, whether it is inside, outside, or on the line, gets a weight of 1 half dot. The number of whole dots plus half the number of edge-touching dots is the total number of dots to be used in estimating the area of a surface.
- Once you have begun counting dots, don’t move the dot grid. If you do accidentally move the grid, don’t just keep counting. You have to start over from the beginning.
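The counting rules above can be sketched in Python. This is only an illustration under stated assumptions: the shape is a circle, the dots sit on a square grid, each dot has a small radius of its own so that it can "touch" the edge, and every name and number here is hypothetical. The grid is dropped at a random offset each time, which is the random placement the first rule asks for.

```python
import math
import random


def weighted_dot_count(whole_dots, edge_dots):
    """Whole dots count 1 each; dots touching the boundary count 1/2 each."""
    return whole_dots + edge_dots / 2


def count_dots_on_circle(radius, spacing, dot_radius, rng):
    """Drop a square dot grid at a random offset over a circle centered
    at the origin and return the weighted dot count."""
    off_x = rng.uniform(0, spacing)
    off_y = rng.uniform(0, spacing)
    # Enough grid steps in each direction to cover the whole circle.
    n = math.ceil((radius + dot_radius) / spacing) + 1
    whole = edge = 0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            d = math.hypot(i * spacing + off_x, j * spacing + off_y)
            if d + dot_radius < radius:
                whole += 1   # dot lies completely inside the shape
            elif d - dot_radius <= radius:
                edge += 1    # dot touches the boundary
    return weighted_dot_count(whole, edge)


rng = random.Random(0)            # fixed seed so runs are repeatable
radius, spacing, dot_radius = 10.0, 1.0, 0.1
trials = [count_dots_on_circle(radius, spacing, dot_radius, rng)
          for _ in range(200)]
# Each dot stands for spacing**2 of area, i.e. 1 / spacing**2 dots per unit.
mean_area = sum(trials) / len(trials) * spacing ** 2
print(mean_area, math.pi * radius ** 2)
```

Any single placement can land a little high or low, but the average over many random placements should sit close to the true area of the circle, which is why nudging the grid toward a "best fit" defeats the method.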
I have created a dot grid for you to use in trying this method out. To use the dot grid, click on the image above to download “dotgrid1.pdf.” This file needs to be printed on plastic transparency, which is available for inkjet and laser printers, as well as copy machines for between $0.15 and $0.75.
‘Big Data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.’ Dan Ariely, Duke University
Given my new job, I spend a lot of time thinking about equity (the people kind, not the investment kind), how we define it (in a formal, first-order logic sort of way), and how we measure it in the real world. As an autistic, gender-queer person, I spend more time thinking about visibility and representation than I ever thought I would. Those two things came together today while I was tagging along on a photoshoot (one that wasn’t mine, for a change) and I saw the sticker pictured above in the featured post image.
The sticker, and its quote, stuck with me (pun wasn’t intended, although it’s rather good and I’m inclined to take credit anyway) – and for good reason. Both the quote and the posterized image are from the first interview with Chelsea Manning after her release from prison under a presidential commutation of sentence in 2017. Ms Manning is the US Army Intelligence Analyst who was convicted in 2013 under the Espionage Act after leaking classified documents to WikiLeaks*.
I couldn’t find a transcript of the interview, which was hosted by the Institute of Contemporary Arts at the Royal Institution in London, and which was held as a conversation with writer James Bridle, but FRIEZE has a good summary (including the photo from the sticker), and video of the entire conversation is on YouTube. The quote from the sticker is just one finely-cut gemstone set in the middle of a very strong paragraph, placed in the middle of an interview full of powerful ideas and insight:
‘There’s a misunderstanding: visibility is not the same thing as equality. As a trans woman, I know that there is a systemic problem that can only be addressed when we challenge the core assumptions in society.’ Chelsea Manning, 1 October 2018
Ms Manning argues that with technology, visibility can make injustice and inequality even worse, saying that if you were already more likely to be arrested, algorithms make that 1 million times worse. There is so much packed into those words, and into the conversation around them. But the thing I took away most from the interview (beyond the original quote from this afternoon that started me down this rabbit hole), and the part that is likely the most applicable to my current work, was a call on algorithm developers to emulate the scientists who refused to work on the Manhattan Project and practice an ethical form of refusal.
An interesting note: I didn’t know who the quote was by, or who the woman on the sticker was when I saw it, but the photoshoot I was tagging along on was for a trans friend of ours and his boyfriend. Seems fitting.
*I am not at all interested in debating the morality of what Ms Manning did (which I, frankly, agree with), or the legality of it (which a vindictive and extremely embarrassed military court found to be illegal) – neither of which is material to this post.
I have been making my way back through Al Sweigart’s “Cracking Codes With Python” from No Starch Press. Although the title is different, it’s actually the second edition of his earlier book “Hacking Secret Ciphers with Python,” which is available at the link as a free PDF, but which I originally studied from as a tattered paper printout.
I had forgotten, first, how much fun it is to just play around in Python and do things that aren’t strictly related to data analysis (although I suppose it really is just another branch of analysis), and, second, how much I enjoy integrating the things I learned from this book years ago, and am now revisiting, with all the neat tricks I’ve learned in Python since. Things such as pulling strings from a MariaDB table, encrypting them, and then writing them back to the table, or the module, a little stand-alone Python program, into which I have been throwing all of the cipher functions I have modified from the code in the book (still maintaining the BSD License, of course). It reminds me so much of the little script programs I wrote as a kid in BASIC, trying to mimic the interactive computer AI that I so dearly wished was a real thing (and which is now a daily reality interacting with my phone and computer through my hearing aids).
A bit of fun output from one of the ciphers:
n5Vz VE74VA4.A 4V8?VE74VH.C 3,VJ.FVzC4V.?4V.5VE74!YVs78DV8D,V7.H4G4C,VzVG4CJV zC64VzDDF!AE8.?VE7zEVEz04DVzDV68G4?VzVD8?6 4VE8!4 8?4Vz?3VDAz24V.5V4I8DE4?24YVm48E74CV2z?VH4VAC.G4VE7zEVJ.FVzC4V 8!8E43VE.VD8?6 4V8D?Ez?2EVz?3V4?E8EJ,V?.CV2z?VH4V34!.?DECzE4V2.!A 4E4 JVE7zEVJ.FVzC4V4I8DE8?6V8?VE74VDz!4VE8!4 8?4/A.8?E-.5-DAz24V2.!18?zE8.?VE7zEVH4VzC4V8?7z18E8?6YVx.FVzC4V48E74CVG4CJVC4z V.CVD.!4V346C44V.5V?.E-BF8E4-D.-C4z Y
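For anyone who hasn’t played with this kind of thing, here is a minimal sketch of a simple shift (Caesar-style) cipher over a fixed symbol set, the sort of beginner cipher that books like this start with. It is not the code that produced the output above: the symbol set, the function name, and the key are all assumptions made for illustration.

```python
# Assumed symbol set for illustration: letters, digits, and a little punctuation.
SYMBOLS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "1234567890 !?."
)


def shift_cipher(message, key, decrypt=False):
    """Shift every character found in SYMBOLS by `key` positions, wrapping
    around at the end. Characters outside SYMBOLS pass through unchanged."""
    step = -key if decrypt else key
    out = []
    for ch in message:
        idx = SYMBOLS.find(ch)
        if idx == -1:
            out.append(ch)  # e.g. a comma is left as-is
        else:
            out.append(SYMBOLS[(idx + step) % len(SYMBOLS)])
    return "".join(out)


secret = shift_cipher("Hello, world!", 13)
restored = shift_cipher(secret, 13, decrypt=True)
print(secret)
print(restored)
```

Decrypting is just shifting in the opposite direction with the same key, which is also why a cipher like this is easy to break by trying every possible key.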
There isn’t really a point in my post, except to share how much I am enjoying revisiting this content after having grown as a programmer. If you are a beginner to intermediate Python user, or even an advanced programmer with an interest but little experience in ciphers, this book is definitely for you.
The Turtlshel Project is back, yet again, for another go at serving up a mix of nerdy, sciency, weird, and wonderful information from my brain and from around the Web. This site has gone through so many iterations, and so many varied topics of focus in the little more than 20 years since I first registered the domain. So, here we go again.