If you are reading this, you’re a coder and use functions. We write them for ourselves. If someone else writes a function, you can hope it works. If it doesn’t, you can hope to fix it. Hopefully, the return value is obviously correct. But maybe it’s subtly wrong?
If things are amiss, read the name of the function and hope it’s descriptive. I worked with a programmer who omitted all vowels from his function names. So the above code would expand to this…
R is a programming language for statistical computing and data visualization. It has been adopted in the fields of data mining, bioinformatics, and data analysis.
We assume R isn’t useful for flipping switches and reading sensors. But that’s an assumption, not a fact.
Let’s be honest. Computer languages are just an abstraction layer on top of the metal. Languages provide constructs to make the expression of logic easier, tailored for the different ways people approach logic and data. But all languages eventually drive the behavior of transistors, and the logic gates and microcircuits built on top of those transistors.
We assume R doesn’t care about transistors. Possibly true, but I don’t believe that’s set in stone. In fact, I’ve proved it’s not true.
So Steve (SQL), Marsha (C), Bob (Python), and I (R) are at this party. We have TOTALLY cleared the room, especially now that Steve and I are deep into a debate about saving native data objects to disk versus storing data in a database.
I see Monica enter from the kitchen, carrying a bowl full of punch. It’s an awkward task and the fruity, sticky liquid is sloshing on the floor. Monica does data science, so I’m hoping she’ll come to my aid. Sure enough, she places the punch bowl on the table and joins us. She’s about to say something when the front door swings open.
Guenter walks in; he just got off a plane from Germany, so he looks a bit jet-lagged. Since the room is filled with a bunch of people talking SQL, he assumes database debates are the theme of the party.
“I think I have already written an article in this context,” Guenter begins.
Before he can say anything more, Helen speaks up. “Perhaps talking about programming is an attempt to get everyone to leave the house at the end of the party so you can go to bed?” Where Helen appeared from is a mystery.
Monica listens for a minute, then interrupts the pointless debate between Steve and me. “People who are math aficionados,” she says, “are a lot more comfortable generating datasets on the fly. People like me enjoy relying on the safety and reliability of importing a structured dataset we checked earlier!”
Steve is happy to hear someone is on his side. Steve thinks I’m a knucklehead. There are many people who agree.
“Sure, but there are advantages to not messing around with unnecessary overhead,” I say. “Let’s play with an example.”
I get out a new napkin and sketch out some R code…
“Eeee-VAP-oooo-TRANS-PURR-ation,” I savor the word as I release it into our conversation. I’m still at the party with Marsha and Bob. We’re trying to determine why anyone (such as me) would want to use R on their Raspberry Pi.
“Big word,” says Bob. “What’s it mean?”
“Water evaporation from the earth and transpiration from plants,” I respond. “It’s a sum of the water escaping from my irrigation system. Look it up on Wikipedia.”
Marsha interrupts grumpy Bob: “So, that means, um… desired amount of water, minus rainfall, plus evapotranspiration, equals the amount of water your irrigation system needs to supply.”
“Precisely,” I agree. “Until I found out about evapotranspiration, I was unsure how to account for temperature. I knew hot days would require more water because of increased evaporation, but I was stumped about how to translate temperature into increased inches of necessary water.”
“Never heard of it,” says Bob.
“Me neither,” I agree. “Evapotranspiration is handy, but doesn’t show up in all weather forecasts. Open-Meteo makes it available.”
“Say you’ve got seven days’ worth of this miracle number,” says Bob. “What does the R code look like?”
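Marsha’s arithmetic can be sketched in a few lines of base R. This is only an illustration, not the code from the napkin: the variable names and all seven days of numbers are made up, and clamping the result at zero with `pmax()` is an added assumption so that a rainy day never produces negative watering.

```r
# Hypothetical seven-day values, in inches; not real forecast data.
desired_in  <- rep(0.20, 7)                            # water the garden should get each day
rainfall_in <- c(0, 0.10, 0, 0, 0.50, 0, 0)            # forecast rain
et_in       <- c(0.15, 0.12, 0.18, 0.20, 0.08, 0.16, 0.17)  # evapotranspiration

# Marsha's formula: desired - rainfall + evapotranspiration.
# pmax() is an assumption added here: it clamps each day at zero,
# because the valves cannot supply a negative amount of water.
irrigation_in <- pmax(desired_in - rainfall_in + et_in, 0)

# Total the irrigation system would need to supply over the week.
weekly_total_in <- sum(irrigation_in)
print(irrigation_in)
print(weekly_total_in)
```

Because R arithmetic is vectorized, the same expression handles one day or seven without a loop.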
I’m at this party where Bob and Marsha and I are discussing the best languages for programming a Raspberry Pi. Bob advocates for Python, Marsha is a devout student of C. I’m defending my use of R. After all, Raspberry Pi starts with R. We have chased all the other guests out of the room with our conversation.
“With R, I have all sorts of built-in data management,” I say. “Manipulating matrices is in R’s basic DNA.”
Steve wanders in from the other room and joins our conversation. “Matrices aren’t a proper data strategy. You should be using a database. You can run SQLite on a Raspberry Pi with hardly any effort.”
Bob and Marsha simultaneously turn to stare me down. They are curious about how I’m going to get around this supposition.
“Sure. SQL with R, in particular SQLite, would have been easy to implement,” I pontificate. “Just call up RSQLite, push a few buttons, and Bob’s Your Uncle.”
“And that’s not what you did?” Steve is incredulous.
“I store the R object on disk and pull it into memory when I need it.”
“What kind of knucklehead stores data as a file on disk?”
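For anyone as skeptical as Steve, here is a minimal sketch of what storing an R object on disk can look like. The conversation doesn’t say which functions are involved, so this assumes base R’s `saveRDS()` and `readRDS()`; the `watering_log` data frame, its columns, and the file name are hypothetical.

```r
# Hypothetical watering log; column names and values are made up.
watering_log <- data.frame(
  day     = as.Date("2024-06-01") + 0:2,
  minutes = c(12, 0, 9)
)

# Write the whole object to disk as a single file.
# tempdir() keeps the example from touching real folders.
log_path <- file.path(tempdir(), "watering_log.rds")
saveRDS(watering_log, log_path)

# Later, even in a fresh R session, pull it back into memory.
restored <- readRDS(log_path)

# The restored object keeps its column types (Date, numeric).
str(restored)
```

The appeal is that the object comes back with its types intact and there is no schema to define. The trade-off Steve would point out is that there is no querying or concurrent access.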
I’m at a party, and the topic of programming languages comes up. A quarter of the room politely leaves, another half rudely leaves, and the remaining three people banter about the proper language for a certain project. Bob, Marsha, and I have the room to ourselves.
Tonight we’re talking microcontrollers: Arduino or Raspberry Pi. We like to connect things and aren’t afraid to release the magic white smoke with an ill-advised application of excess voltage. Bob assumes programming is done with MicroPython. Marsha prefers C.
“I use R to water my garden,” I say. “Yep, I installed R on a Raspberry Pi and use it to turn the water valves on and off.”
Human languages are notoriously ambiguous. Computer languages are notoriously unambiguous. Humans (mostly) are comfortable with uncertainty. Computers don’t even believe uncertainty is possible. It’s what led us to create unambiguous languages specifically for computers.
“One morning, I shot an elephant in my pajamas. How he got in my pajamas I don’t know.”
My new car displays the highway speed limit. It’s a small reproduction of a speed limit sign, in the upper left corner of the display. Which isn’t such a big deal, considering map data has included speed limits for the last ten years.
But I had a surprise when I was driving on a rural back road. It was twisty and in the middle of the forest. I was preoccupied with trying to avoid the deer springing out of the ditches, but noticed the speed limit had changed to eighty-five miles per hour. What?
You sent me the wrong version of the Excaliber Coffee Pot. Your company constantly makes mistakes with shipping. I hate doing business with you!
I just wanted to tell you I received my Excaliber Coffee Pot today. It is exactly what I expected and needed. I can’t thank you enough!
Consider these two pieces of feedback from customers, sent via the contact form on a company’s website. One is from an unhappy customer – and one is from a satisfied customer. Both are regarding shipping the Excaliber Coffee Pot. An astute product manager (we’ll call her “Sarah”) would follow up: why is this customer happy (or sad)? Would this customer recommend this product to (or warn) their friends? Would this customer shop with us again?
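To make the happy-versus-unhappy distinction concrete, here is a toy sketch in base R that scores the two messages by counting words. It is not a real sentiment method: the two word lists and the `score_feedback()` helper are invented for this illustration.

```r
# Toy word lists, invented for this illustration; not a real sentiment lexicon.
positive_words <- c("thank", "exactly", "needed")
negative_words <- c("wrong", "mistakes", "hate")

# Score = number of positive words minus number of negative words.
score_feedback <- function(text) {
  # Lowercase, then turn every run of non-letters into a single space.
  cleaned <- gsub("[^a-z]+", " ", tolower(text))
  tokens  <- strsplit(trimws(cleaned), " ", fixed = TRUE)[[1]]
  sum(tokens %in% positive_words) - sum(tokens %in% negative_words)
}

unhappy <- "You sent me the wrong version of the Excaliber Coffee Pot. Your company constantly makes mistakes with shipping. I hate doing business with you!"
happy   <- "I just wanted to tell you I received my Excaliber Coffee Pot today. It is exactly what I expected and needed. I can't thank you enough!"

scores <- c(unhappy = score_feedback(unhappy), happy = score_feedback(happy))
print(scores)
```

With these hand-picked lists the unhappy message scores below zero and the happy one above it. A word count like this says nothing about why the customer feels that way, which is the question Sarah actually wants answered.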
Let’s assume you’re reading this for one of three reasons:
You have experience with R, but not NLP.
You have experience with NLP, but not R.
You have no idea what this is all about, but someone said you need this for some reason. (Perhaps a thesis advisor? A data scientist? A trendy article?)
Let’s start with two brief explanations you can use to orient yourself in this new world.