Lenny's Quest

I remember many years ago when I was a wee lad of ten, my father and I were traveling home one day. As I sat in the car next to him he began to speak. The tone he used was one I knew so well. It had a meaning. That meaning: a joke was incoming.

So I listened, enthralled by the details, wondering when the punchline would come out and smack me with a well-timed hit. After a minute or two of backstory, my dad stopped. He looked at me and said, “Sorry, buddy, I feel like God is telling me not to tell you this joke.”

Now, telling a ten-year-old they can't have something is a surefire way to make them want that thing, and sure enough I wanted that joke. No, I needed that joke.

Dad was resolute and did not let a single detail slip no matter how I worded the request.

Years passed: we moved countries, I graduated high school, I left home, I got married, I started balding, I had kids, I got a degree. Over two decades had passed since dad had dangled that joke in front of my face and snatched it away.

Then one fateful day I was at a function where a lot of family and friends were present, including my father. I was talking to another group of people when, out of the corner of my ear, I heard a passage I had not heard in twenty years. Just a few words, and my brain made the instant connection to that car ride that took place on the other side of the world.

I slowly moved closer, waiting for the closure that had been denied to my younger self. A million thoughts raced through my mind. Was he going to stop before the punchline this time too? Would he see me and stop telling it? How could I be sure it was, in fact, the same joke? I knew each thought was irrational.

I listened, and suddenly, without warning, there it was, the punchline. After twenty years of waiting I finally knew how the joke ended. All thoughts in my head ceased, and I was left with but one: That was a fucking shit joke.

#life #sliceoflife #joke #jokes


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

I dream of being a writer the same way many people dream of being actors: I would love to do it, but I know the chance of success is incredibly small. Even still, I dabble from time to time and try my hand at the odd bit of prose here and there.

There are two sayings that I'll be focusing on here, and the first is:

My advice to the young writer is likely to be unpalatable in an age of instant successes and meteoric falls. I tell the neophyte: Write a million words–the absolute best you can write, then throw it all away and bravely turn your back on what you have written. – David Eddings

I've heard this before but hadn't focused on this little part here:

the absolute best you can write

I half-heard the quote and thought: A million words? Yeah I can do that. If I do four stories of 50,000 words a year then I can be done in five years!

I don't know what prompted me to reread the quote but then I saw it. The absolute best you can write. This wasn't just a numbers game, it was about solid effort too. I imagine if I did write out a million words then I would indeed get a bit better. But if I wanted to really excel then I needed to up my game.

I don't know how long this will take, which brings me nicely to the next saying I want to focus on:

The best time to plant a tree is twenty years ago. The second best time is now. – Chinese Proverb

While this is obviously overused today, I feel it still tackles part of the issue I have with long-term practice. I am thirty-four years old. There have been countless people who were older than me when they made their debut as professional authors, but it is still hard not to compare myself to those the same age who have already made it, or worse, those younger than me.

I look at those who are successful and the self-doubt creeps in. They started younger. They have a talent you don't. They have more practice. They made their name before the flood of GPT-generated stories drowned the market.

Well, none of that matters; I still want to be an author. Even if I never sell a book, I think trying would be better than not trying at all. To this end I will publicly state my life goal here:

I will be a full-time author ten years from now. – Lenny, 2023

So, here I go. This is me starting from word zero. My current goal is to write a fifty-thousand-word novel of my best work in two months. It's similar to a NaNoWriMo challenge, but at the end I should have a polished story.

P.S. Got any good writing book recommendations? I'm currently reading “Write Worlds Your Readers Won't Forget” by Stant Litore and I think it's amazing!

#writing #gpt #stories #story #thoughts #challenge


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

Note: These views are all my own and do not reflect those of my employer

Intro

This week for work I had to compare some of our offerings to the offerings of other companies that offer the same offerings (any more offers?). This required me to do two things: scrape the listings of other companies and then compare them to our offerings.

While I won't be giving any code examples in this post, I will say that I used iPython (or is it Jupyter these days...) notebooks to do all the work. I found it useful because I could document my progress as I went and easily run only small blocks of code. After everything is working I plan to move it all to plain Python scripts that can easily be run on a server.

What I did

Scraping was straightforward but time-consuming. I used some tricks I learnt from reading Automate the Boring Stuff by Al Sweigart to scrape the pages. To be fair, that book is pretty old by now and I haven't looked for alternatives, so maybe there is something better these days than requests and BeautifulSoup4.

Now that I had the data the next step was to match product to product. The problem is that product names are slightly different between companies, but the good thing is each product has a fairly lengthy description. After a bit of querying the internet I discovered it was possible to do a semantic search on text using embeddings.

Now, I have only half-assedly researched this, so I'm not the best person to learn this from, but my understanding is that embedding data is simply a way of storing data in another format. In this case we are storing the semantic meaning of text as a vector. The way we get the semantic meaning is by using a pre-trained model that is good at extracting it.

Looking through the OpenAI API pricing I saw that their embedding model was really cheap. I estimated it would cost me about $0.25USD to get the embeddings for every product description I had.
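In case you're curious, here's a rough sketch of what fetching the embeddings can look like. This isn't my actual work code: it assumes the current openai Python package (v1+), and the model name is just a placeholder for whichever embedding model you pick.

# Rough sketch only: assumes the openai Python package (v1+) and a placeholder model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_descriptions(descriptions):
    # Returns one embedding vector (a list of floats) per description,
    # in the same order as the input.
    response = client.embeddings.create(
        model="text-embedding-3-small",  # placeholder model name
        input=descriptions,
    )
    return [item.embedding for item in response.data]

our_vectors = embed_descriptions(["An introductory course on architecture...", "A course on chemistry..."])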

And I was right about the cost! I now had a few thousand embeddings for product descriptions. Now, the thing about embeddings is that they are vectors, and the more similar the semantic meaning, the closer in angle the vectors will be to each other. For instance, if I was selling an architecture course and you were selling a building design course, then semantically the vectors should be very close in angle. However, the semantic vector for my architecture course and the semantic vector for your chemistry course would probably have very different angles.

Now, the vectors I got back from the OpenAI embedding API were 1031-dimensional vectors. The cool thing is they still have angles between them, even if visualizing that is not something humans can do.

I iterated over every product we offer and compared it to every other product offered by our competitors using spatial.distance.cosine from the scipy package. The smallest cosine value indicated the best match.

Note: It felt a bit weird that it was the smallest, since a lot of documentation I found (for instance, here at Google) seemed to indicate that a bigger number should correlate to a better match. The catch is that those docs are talking about cosine similarity, whereas scipy's spatial.distance.cosine returns the cosine distance (one minus the similarity), so a smaller distance means a better match.
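To make the matching step concrete, here's a minimal sketch of the idea with made-up product names and tiny dummy vectors (the real ones obviously have far more dimensions):

# Minimal sketch of the matching loop, with dummy 3-dimensional vectors.
from scipy.spatial import distance

ours = {"architecture 101": [0.10, 0.30, 0.90]}
theirs = {
    "building design basics": [0.10, 0.31, 0.88],
    "intro to chemistry": [0.90, 0.20, 0.05],
}

for our_name, our_vector in ours.items():
    best_name, best_distance = None, float("inf")
    for their_name, their_vector in theirs.items():
        # cosine() returns the cosine *distance* (1 - similarity),
        # so smaller means more semantically similar
        d = distance.cosine(our_vector, their_vector)
        if d < best_distance:
            best_name, best_distance = their_name, d
    print(f"{our_name} -> {best_name} (distance {best_distance:.3f})")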

Outcome

Looking at the results I was mostly impressed. A lot of our products matched perfectly with other products. Honestly, it kinda blew my mind that I had taken descriptions of products, captured their semantic meaning in vectors, and then been able to match on that. (Part of the reason I am writing this post is that I don't think my coworkers can take me gushing about it anymore, and I still want to gush.)

It wasn't all sunshine and rainbows though; there were a few products that CLEARLY didn't match what was shown. My problem is I don't know the material deeply enough to begin to diagnose where the issue is. Are the descriptions too similar? Are my assumptions wrong? Did I store a vector against the wrong entry? I guess I'll need to research some more.


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

Following some advice I got over on Hacker News I am going to start posting little Today I Learned posts as a way to help me ease into more consistent blog writing.

The Code

-- Note, I run this on Oracle DB
CREATE TABLE NEW_TABLE AS
SELECT *
    FROM OLD_TABLE
    WHERE 1 = 0
;

What this achieves and why

This will create a new table called NEW_TABLE that has the exact same schema as OLD_TABLE. (Note, I don't know if EVERYTHING is the same all the way down, but the schema is the same which is what I care about.)

I generally use this when I have external data coming from a spreadsheet or an SQL script that I want to insert into my OLD_TABLE, but first need to perform some alterations to the data.

Before now I tended to copy the DDL from the source table and run it. This is much quicker!

Tags

#til #sql #oraclesql


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

One of my all-time goals has been to learn Japanese. However, every time I began the quest I would get dismayed about the long journey ahead of me. This would bring about feelings of frustration and I would inevitably quit. It has been over a decade since I first attempted to learn, and had I stuck with it I would be extremely proficient by now. Just think of all the shows and manga I could watch and read in their native Japanese.

But, as they say, the best time to plant a tree is yesterday, the second best time is now. So I am learning again. I have found a YouTuber called ToKini Andy who has a lot of guides on how to learn Japanese. I am quite fond of his teaching style so I'll probably follow close to his instructions.

So far my plan of attack (which was shamelessly stolen from one of his videos) for the first bit is like so:

  1. Learn Hiragana and Katakana together
  2. Buy 1000 Essential Vocabulary for the JLPT N5
  3. Buy the Genki workbook and textbook, then watch this video by ToKini Andy
  4. Buy Remembering the Kanji

Points 2, 3 and 4 will most likely happen at the same time. I have luckily managed to find them all on sites that aren't amazon. I despise amazon and avoid using it whenever I can.

According to ToKini Andy this will take me about a year. I also plan to bust out the old 3DS and grab a few language learning games from there. Apparently there are a lot.

If you have any advice let me know! Or even just tell me how your learning journey went! Or even just say おはよう!

#Japanese #Learning #JapaneseLanguage #Noob #Quest


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

So today I worked on making a small API just for funsies. It allows you to roll dice or draw a card from a deck. It was a thought I had while thinking about a solo TTRPG called Colostle. In Colostle you draw cards from a deck, and the cards you draw determine the events you encounter. I thought it would be cool to have a deck of cards that I can easily come back to later and have in the same order. And I figured, why not start with dice first?

The dice part was pretty easy: simply parse the string into dice. So, for instance, hitting http://web.site/die/1d3 4d6 2d20 would roll one D3, four D6's and two D20's. The rolling is done via Python's built-in random.randint function. When done, I total up all the dice for each category (e.g., I total the 4D6) and return them in JSON format.

Here's an example output from that call:

{
  "1d3": {
    "1": 3,
    "total": 3
  },
  "2d20": {
    "1": 4,
    "2": 7,
    "total": 11
  },
  "4d6": {
    "1": 4,
    "2": 5,
    "3": 2,
    "4": 6,
    "total": 17
  }
}
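For the curious, the parsing and rolling boils down to something like this. It's a simplified sketch rather than the exact code behind the endpoint:

# Simplified sketch of the dice parsing and rolling (not the exact endpoint code).
import random

def roll_dice(query: str) -> dict:
    # Turns a string like "1d3 4d6 2d20" into the structure shown above.
    results = {}
    for group in query.split():
        count, sides = (int(part) for part in group.lower().split("d"))
        rolls = [random.randint(1, sides) for _ in range(count)]
        results[group] = {str(i + 1): roll for i, roll in enumerate(rolls)}
        results[group]["total"] = sum(rolls)
    return results

print(roll_dice("1d3 4d6 2d20"))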

The deck part was a little trickier. I needed a way to store a shuffled deck and then recall it later. I ended up using SQLite to store the decks. I have a table called DECKS that has the following DDL:

CREATE TABLE DECKS (DECK_NAME VARCHAR, CARD VARCHAR, DECK_ORDER INTEGER);

So when I put in a deck of cards, I store the deck's name against each card along with the order in which the cards will be pulled. Then getting the next card is as simple as:

SELECT CARD 
    FROM DECKS
    WHERE DECK_NAME = 'Some-Name-Here'
    ORDER BY DECK_ORDER
    LIMIT 1
;

Once I draw a card I run another SQL command that will delete that row from the table (this command is left as an exercise to the reader). Selecting a card and then deleting it gives the effect of drawing a card from the deck. And because I'm using SQLite, it doesn't matter how much later I come back to this; it should still be there. (Although, if other people use this (which I doubt) then I will have to have some way of culling it so as not to fill up the storage space (but I really doubt that will ever happen) )
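If you want to see the shape of the whole store-then-draw cycle in code, here's a rough sketch using Python's built-in sqlite3. The table matches the DDL above, but the rest is illustrative rather than my actual API code (and yes, it spoils the exercise):

# Rough sketch of storing a shuffled deck and drawing from it.
# The table matches the DDL above; everything else is illustrative.
import random
import sqlite3

SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]
RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Jack", "Queen", "King"]

def store_new_deck(conn, deck_name):
    # Shuffle a standard 52-card deck and record the draw order.
    cards = [f"{rank} of {suit}" for suit in SUITS for rank in RANKS]
    random.shuffle(cards)
    conn.executemany(
        "INSERT INTO DECKS (DECK_NAME, CARD, DECK_ORDER) VALUES (?, ?, ?)",
        [(deck_name, card, order) for order, card in enumerate(cards)],
    )
    conn.commit()

def draw_card(conn, deck_name):
    # Grab the next card in order, then delete it so it can't be drawn again.
    row = conn.execute(
        "SELECT CARD, DECK_ORDER FROM DECKS "
        "WHERE DECK_NAME = ? ORDER BY DECK_ORDER LIMIT 1",
        (deck_name,),
    ).fetchone()
    if row is None:
        return None  # the deck is empty
    card, order = row
    conn.execute(
        "DELETE FROM DECKS WHERE DECK_NAME = ? AND DECK_ORDER = ?",
        (deck_name, order),
    )
    conn.commit()
    return card

conn = sqlite3.connect("decks.db")
conn.execute("CREATE TABLE IF NOT EXISTS DECKS (DECK_NAME VARCHAR, CARD VARCHAR, DECK_ORDER INTEGER)")
store_new_deck(conn, "Some-Name-Here")
print(draw_card(conn, "Some-Name-Here"))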

What about the names? Where do they come from?

I appreciate you conveniently asking me that. I have a big ol' list of words in the database. I got it from some repo with an MIT license and imported them as part of the setup. Then I grab three random words and join them together with hyphens.
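The name generation is about as simple as it sounds; here's a tiny sketch of the idea, with a stand-in word list instead of the real table:

# Sketch of the deck-name generator. The word list is a stand-in for the real one.
import random

WORDS = ["amber", "falcon", "river", "quartz", "meadow", "comet"]

def generate_deck_name():
    return "-".join(random.sample(WORDS, 3))

print(generate_deck_name())  # e.g. "falcon-amber-comet"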

On a side note, be careful where you pull lists from. This one had several racial slurs. It was only after working with the list for a while that I figured I should check on a hunch. Glad I did. Does anyone know where to get a list of words that doesn't have racial slurs? There are so many words that I'm sure I missed some.

Anyway, that's the long and short of it. I haven't actually got an accessible endpoint set up yet, but when I do I'll post the link. If you want to judge me harshly and read my code, you can do so >>here<<

I'm hoping to make other utilities over time and have the API turn into some cyber Swiss Army knife!

#API #dice #cards #random #TTRPG


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

The Issue

I want to have a #blog. Why? Maybe part of me wants something for the world to remember me by. Maybe I want to prove to everyone else that I do in fact produce stuff of value. Maybe it's because I am vain and love hearing myself talk.

However, if you have looked around my meager blog you will as of now find only one lonely post, and that post is actually an old one I wrote over on Medium. It feels like that one guy in your town who peaked in high school but still has his high school footy trophies on display in the living room.

Sooo... What is holding me back? Where's the writing? Why the hell did I pay for this domain to only post one article?!?!

Well, this is me, publicly saying stuff, and part of me feels like the words have to be perfect, to have meaning. You never know when a future (or current) employer is going to find this stuff. I need to look PROFESSIONAL!

Yet I constantly tell everyone I know that my job is just that, a job. I don't find my life's meaning in my work. I will do it to the best of my abilities, but when I'm lying on my deathbed I don't want to hold up the corporate ladder I climbed and have that as my proudest achievement.

I want to find my value in life in hobbies, in experiencing new things, and in spending time with my family. It seems a bit silly to waste a blog on being PROFESSIONAL when I don't even consider that one of my core values.

So What Now?

I read an article by internetvin titled I don't like making the best things. In it they talk about how they only wanted to publish the best, and so they found they weren't publishing at all. Trying to make the best things also left them paralyzed and unhappy.

I can totally empathize. In fact, this isn't even the first time I've written about this. I wrote a short post on my old BearBlog account (Medium, BearBlog, yeah I know, I get around) about how I don't want to be the best. I'd link it but one time I got super depressed and deleted a whole lotta my online presence (I'm much better now thank you very much for asking).

So! No more waiting for the best inspiration, the best project, the best whatever. I'm gonna post lots of stuff. Stuff ranging from things that have grabbed my interest to cool things I did that day to everything in between. Who knows what I may feel like sharing with you all out there (you are all out there, right?).

Let's start with three facts:

  1. I have a #BorderCollie named #Kenobi
  2. It is a terrible idea to make your dog give you a kiss before they're allowed to eat; too much drool
  3. Today I reached my highest speed on my #fixie (fixed wheel #bicycle): 39.9km/h! Pretty scary when you're wearing just thongs (flip flops).

tl;dr: Gonna write more.


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

Note: I originally wrote this article back in 2019. At the time I was interested in showing people how much Google actually tracked a person. Unfortunately, my drive died pretty shortly after this article was written.


Last week I was looking through all the personal data I could export with Google Takeout. One of the things I noticed was they had the option to export your Google Pay data. For the uninitiated, Google Pay lets you use your smartphone as a Pay-Wave credit card replacement. This makes it more convenient for you to spend your hard-earned dollars on those things you really don't need. Unable to contain my curiosity about my spending habits, I promptly downloaded the history.

Now, for some reason Google has decided the only format you are allowed to download Google Pay data in is… HTML… 'Cuz, you know, HTML is best for large data sets /s. Anyway, I opened up the HTML page and the very first thing I noticed was that each purchase had a link attached to it that pointed to Google Maps! It was at this moment I knew I wanted to make a heat-map of my purchases.


Let's get started! First things first, download your payments from Google. Go to https://takeout.google.com. Once there, click “Manage Archives” and then “Create New Archive”. For our purposes we are only interested in your Google Pay data, so make sure all options are unticked except for Google Pay. Click “Next”, then click “Create Archive”. When your archive is ready you will receive an email.

When it's ready, download your archive and extract the file located at Takeout > Google Pay > My Activity > My Activity.html from the zip. This is the file we will be scraping the information from. Save it to an empty folder somewhere. We'll use this folder for both the code and for storing the input and output. Go ahead and look through the HTML file to get a feel for the layout.

The first thing I wanted to do was strip out the following items from each purchase and save them to a CSV file: amount, date, time, latitude and longitude. To begin with, I inspected the HTML file with Firefox. I found that each payment entry was surrounded by a div with a class called “mdl-grid”.

From there I was able to work out the div for date and time, price, and latitude and longitude. Now, on to the Python!

For this project I'm using Python 3.6, but I feel any Python 3 version should work (don't quote me). The first thing we need to do is install Beautiful Soup. If you don't know what it is, Beautiful Soup is a super handy tool for looking through HTML files (online or offline). It makes it really easy to search for an element based on type, class or ID. If you want to use a virtual environment, go ahead and activate it now. To install Beautiful Soup run:

pip install bs4

Now make a new file called scrapePayments.py. Below is the code I used to scrape the values to a CSV file:

import bs4, sys, csv

LOCAL_CURRENCY_SYMBOL = '$'

def main():
    
    # Import html file to beautiful soup
    with open(sys.argv[1], 'r', encoding='utf8') as file:
        html = bs4.BeautifulSoup(file, "html.parser")

    # find tags that have payment details
    payment_html = html.select('.mdl-grid')

    # '.mdl-grid' also grabs the entire page, as the whole
    # webpage is wrapped in a 'mdl-grid' div. This just removes it
    del payment_html[0]

    payments = []
    # This is the part that grabs the data out of each element.
    for payment in payment_html:
        try:
            payment_details = extract_purchase_details(payment)
            payments.append(payment_details)
        except Exception:
            # This is a hacky way to remove elements that don't fit properly.
            # If a payment element throws an exception then instead of handling it we just ignore it.
            # This helps remove "payments" that are things like special promotions and such
            pass

    write_to_csv(payments)

def extract_purchase_details(payment):
    # This is responsible for extracting the payment details
    date, time = (payment
                    .select('.mdl-typography--body-1')[0]
                    .get_text()[57:]
                    .split(',')
    )
    time = time.strip() # This removes the whitespace in front of time

    price = (payment.select('.mdl-typography--caption')[0]
                        .get_text()
                        .split("\u2003")[3]
                        .split(LOCAL_CURRENCY_SYMBOL)[1][:5]
                        .replace('W', '')
    )
    
    maps_url = (payment
            .select('.mdl-typography--caption')[0]
            .find('a')['href']
    )
    query_index = maps_url.find("query=")
    lat_and_lon = maps_url[query_index+6:]
    lat, lon = lat_and_lon.split(',')
    lat = lat.strip()
    lon = lon.strip()
    
    return (price, date, time, lat, lon)

def write_to_csv(payments):
    # Writes the values of payments to a csv file.
    with open('output.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        # Write the headers for the CSV file.
        writer.writerow(['amount', 'date', 'time', 'lat', 'lon'])
        # Write the array of tuples to the CSV file
        writer.writerows(payments)


if __name__ == "__main__":
    main()

To begin with, this code reads the HTML file into a Beautiful Soup object. Then from that object we select the items that relate to payment (every object that has the class 'mdl-grid'). The way we've chosen to select the items gives us an extra element we don't want, so we just pop it off the list.

After that we go through each payment and feed each one through our 'extract_purchase_details' function. This function extracts all the information from the HTML elements. The date and time, for example, are pulled from the text element with the class 'mdl-typography--body-1'. There are actually two elements with this class, so the select returns a list with two elements. We just take the first one. The raw text of that element looks something like the following:

Attempted contactless payment<br>8 Jan 2019, 20:47:30 AEDT

To remove the excess we use a string slice which keeps only the date and time. Then we split the string at the comma and save the date to the 'date' variable and the time to the 'time' variable. The rest of the values are extracted in a similar way. Once we have all the values, the function returns them as a tuple, which we store in 'payment_details'. Finally we append this tuple to our payments list.

This is all wrapped in a try-except block. This is a little bit of a cheat to get rid of the entries that don't actually have purchases in them (things like promotions). Because they don't have the same layout, they raise an exception when we try to access elements of the purchase that don't exist. Instead of handling the exception, we're just ignoring it (just like how I handle all my real life problems!)

Once all values are scraped, we then use Python's CSV library to write the values to a CSV file. To run this, do the following command:

python scrapePayments.py 'My Activity.html'

Now you have all your purchases in a nice CSV file. Depending on your currency you may need to adjust line 3 where we define the local currency symbol '$':

LOCAL_CURRENCY_SYMBOL = '$'
# Might need to become
LOCAL_CURRENCY_SYMBOL = '£'

Funny story: before I added that line to split on the dollar sign, I found out that the last time I went to the casino someone charged me 18 Indonesian ~~Rupees~~ Rupiah*! 😲 (I live in Australia; that's like $0.0018 AUD at the current exchange rate!) *Thanks to @yogasukmap for pointing out that they use Rupiah, not Rupees, in Indonesia!


Alright, time to heat-map! To be perfectly honest, for this part I followed a guide written by a guy called Mike Cunha over on his website. The blog post can be found –> here <–.

EDIT: Update from 2023 again, I just checked this page and the certificate has expired. Here's a link from the wayback machine, the formatting is missing but you can still get the info.

We don't need to follow his method exactly as he adds a boundary to his map. So for our purposes you'll need to install only the following python modules:

pip install folium pandas

Pandas is an awesome library for handling data. It is heavily used in the data science community. Folium is a Python interface to the map-building JavaScript library Leaflet.js.

So, make a new file called 'mapgen.py' and input the following code:

import pandas as pd
import folium
import os
import sys
from folium.plugins import HeatMap

def main():
    # Read map data from CSV
    with open(sys.argv[1], 'r', encoding='utf8') as file:
        map_data = pd.read_csv(file)

    # Find highest purchase amount
    max_amount = float(map_data['amount'].max())

    # Makes a new map centered on the given location and zoom
    startingLocation = [-19.2, 146.8] # EDIT THIS WITH YOUR CITY'S GPS COORDINATES!
    hmap = folium.Map(location=startingLocation, zoom_start=7)

    # Creates a heatmap element
    hm_wide = HeatMap( list(zip(map_data.lat.values, map_data.lon.values, map_data.amount.values)),
                        min_opacity=0.2,
                        max_val=max_amount,
                        radius=17, blur=15,
                        max_zoom=1)

    # Adds the heatmap element to the map
    hmap.add_child(hm_wide)

    # Saves the map to heatmap.html
    hmap.save(os.path.join('.', 'heatmap.html'))

if __name__ == "__main__":
    main()

Ensure that you change the value of the variable on line 16. This is where the map will be centered when you open the generated webpage.

Once you've finished writing the code run:

python mapgen.py output.csv

Upon completion you will have a brand new file called “heatmap.html”. Open up that file and look at all the places where you have thrown your money away!

Unsurprisingly, most of my Google Pay purchases are for my lunch. From the map, can you guess where I have lunch? 😂

A gif showing the heat map in action.


Thanks for reading! If you have any tips or thoughts you'd like to share I'd love to hear them.

— Lenny

Edit 11/01/2019: I got some great feedback from people over on Reddit, so I've implemented some changes they suggested. This includes changing from camelCase to snake_case to be more in line with Python standards, and splitting the code that extracts the payment info into its own function. Finally, I broke each of the chained methods I use to extract payment information onto multiple lines to make them easier to follow.

EDIT March 2021: I have updated the scrapePayments.py file as Google has changed the format of the webpage slightly. It should run properly again. However, I have noticed that I don't have any payments with GPS coordinates since the end of 2019. I am unsure if Google still records GPS coordinates when you pay, or if they just tie purchases back to their constant location tracking.


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.

New Blog!

I have created this blog as I have felt for quite some time it would be neat to do. I am still learning the ropes of this platform, so bear with me through any blunders I make.

With this blog I hope to be able to share insights I stumble across or cool projects I do, as well as leaving a record that I can look back on in the future and (hopefully) feel proud of what I've accomplished.

I think it would be cool to also use this platform to share some of the stories I have written. My initial thought is to make a separate user for each story and then people can subscribe to the individual stories so they can keep up with chapters as they come out.

— Lenny


If you have something to say don't forget to tag me (@wyrm@wyrm.one) so I can respond!

Or email me at blog (at) lennys (dot) quest.