

Issue #37 - November 17, 2019

Here are the top threads of the week, happy reading!

Top comment by seveibar

Postgres -> Metabase

I believe this is the best combination of cheap and powerful for early-stage startups. My very non-technical cofounder is able to use Metabase's simple GUI to create graphs and insights (even joining and aggregating across tables!), and for anything complex I can step in and give a helper SQL query. We have around 10M records that we aggregate for daily insights.
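For flavor, here is the kind of helper query that workflow implies. This is a sketch of mine, not the commenter's actual query: the orders and users tables, their columns, and the psycopg2 connection string are all invented, and the same SQL could simply be pasted into Metabase's native SQL editor instead of being run from Python.

```python
# Sketch: a "helper" aggregation query a developer might hand off for
# use in Metabase's SQL editor. Table and column names (orders, users,
# created_at, amount, plan) are hypothetical -- adapt to your schema.
import psycopg2

HELPER_QUERY = """
    SELECT date_trunc('day', o.created_at) AS day,
           u.plan,
           count(*)      AS orders,
           sum(o.amount) AS revenue
    FROM   orders o
    JOIN   users  u ON u.id = o.user_id
    GROUP  BY 1, 2
    ORDER  BY 1;
"""

conn = psycopg2.connect("dbname=app user=postgres")  # adjust DSN as needed
with conn, conn.cursor() as cur:
    cur.execute(HELPER_QUERY)
    for day, plan, orders, revenue in cur.fetchall():
        print(day.date(), plan, orders, revenue)
conn.close()
```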

Metabase can also be run as a pseudo-desktop/web application to save additional cash (we don't do this though).

Top comment by christophilus

No. Tests are like any other code. They incur technical debt and bugs at the same rate as other code. They also introduce friction to the development process. As your test suite grows, your dev process often begins to slow down unless you apply additional work to grease the wheels, which is yet another often unmeasured cost of testing in this fashion.

So, in short, I view tests as a super useful but over-applied tool. I want my tests to deliver high enough value to warrant their ongoing maintenance and costs. That means I don't write nearly as many tests as I used to (in my own projects), and far fewer than my peers.

Where I work, tests are practically mandated for everything, and a full CI run takes hours, even when distributed across 20 machines. Anecdotally, I've worked for companies that test super heavily, and I've worked for companies that had no automated tests at all. (They tested manually before releases.) The rate of production issues across all of my jobs has been roughly the same.

This issue tends to trigger people. It's like religion or global warming or any other hot-button issue. It would be interesting to come up with some statistical analysis of the costs and benefits of automated tests.

Top comment by aaron-santos

What didn't work:

Shipping pickled models to other teams.

Deploying SageMaker endpoints (too costly).

Requiring manual edits to config files to deploy endpoints.

What did work:

Shipping HTTP endpoints.

Deriving API documentation from model docstrings.

Deploying Lambdas (less costly than SageMaker endpoints).

Writing a ~150-line Python script to pickle the model and save a requirements.txt, some API metadata, and test input/output data (see the sketch after this list).

Continuous deployment (after the model is saved, no manual intervention is needed as long as the model's response matches the saved output data).
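A minimal sketch of what such a packaging script might look like, under assumptions of mine: the file names, metadata fields, and the stand-in scikit-learn model are invented, not details from the original comment.

```python
# Sketch of a model-packaging script in the spirit described above.
# Everything here (paths, metadata fields, the example model) is
# hypothetical -- the original ~150-line script is not public.
import json
import pickle
from pathlib import Path

def package_model(model, name, version, test_input, expected_output,
                  requirements, out_dir="artifact"):
    """Bundle a model with everything a deploy pipeline needs to
    publish and smoke-test an endpoint without manual intervention."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)

    # 1. Pickle the trained model.
    with open(out / "model.pkl", "wb") as f:
        pickle.dump(model, f)

    # 2. Pin dependencies so the endpoint build is reproducible.
    (out / "requirements.txt").write_text("\n".join(requirements) + "\n")

    # 3. API metadata, e.g. for deriving docs from the docstring.
    meta = {"name": name, "version": version, "doc": model.__doc__ or ""}
    (out / "metadata.json").write_text(json.dumps(meta, indent=2))

    # 4. Test input/output pair: CD promotes the endpoint only if its
    #    response matches expected_output.
    (out / "test_io.json").write_text(json.dumps(
        {"input": test_input, "expected_output": expected_output}, indent=2))

if __name__ == "__main__":
    from sklearn.dummy import DummyClassifier  # stand-in model
    clf = DummyClassifier(strategy="most_frequent").fit([[0], [1]], [0, 0])
    package_model(clf, "example-model", "0.1.0",
                  test_input=[[1]], expected_output=[0],
                  requirements=["scikit-learn==0.21.3"])
```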

Top comment by peterwoerner

Because there is a real moat in data ownership and pipelines. If you want to do any analysis, you quickly find that learning to properly use scikit-learn and TensorFlow (or the machine learning library of your choice) is at least an order of magnitude less work than getting the data. For instance, I wanted to build a machine learning model that took simple data from SEC filings (10-Qs and 10-Ks, which are freely available online) and predicted whether stocks were likely to outperform the market average over the next 3 years.

Time to set up scikit-learn and TensorFlow models to make predictions: 4 hours. Time to set up Python scripts that could parse the Excel spreadsheets, figure out which row corresponded to gross profit margin, and extract a few other "standard" metrics: unknown, because I gave up after about 80 hours of trying to figure out rules to process all the different spreadsheets and how the names were determined.
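To make the parsing pain concrete, here is a sketch (mine, not the commenter's) of the row-labeling problem: the same line item appears under different names across filings, so even locating a row requires fuzzy matching. The alias list and row labels below are invented examples, not actual SEC data.

```python
# Sketch of the row-labeling problem: the same financial line item
# shows up under many names across filings.
import difflib

ALIASES = {
    "gross_profit": ["gross profit", "gross profit (loss)",
                     "gross margin", "total gross profit"],
}

def find_row(row_labels, metric):
    """Return the index of the row that best matches a known metric,
    or None if nothing is close enough."""
    best, best_score = None, 0.0
    for i, label in enumerate(row_labels):
        for alias in ALIASES[metric]:
            score = difflib.SequenceMatcher(
                None, label.strip().lower(), alias).ratio()
            if score > best_score:
                best, best_score = i, score
    return best if best_score >= 0.8 else None

rows = ["Revenues", "Cost of goods sold", "Gross Profit (Loss)", "SG&A"]
print(find_row(rows, "gross_profit"))  # -> 2
```

Multiply that by every variant spelling, abbreviation, and layout quirk across thousands of filings, and the 80-hour dead end above becomes easy to believe.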

I had a professor who was doing machine learning + chemistry and building up his own personal database for it. He spent ~5 years with about 500 computers building the database so that he could do the actual machine learning.

Top comment by spc476

One technique I learned in 5th grade is called "brainstorming." It's a simple concept: you get a group together (or you can do this alone, in which case you just skip one step). You establish a clear problem statement. Then for 10 or 15 minutes, everybody writes down ideas to solve the problem. Nothing detailed, just a few words per idea. And no judgement at this time. Just ideas, no matter how silly ("feed mayonnaise to tuna fish," for example) until time is up.

Then one person starts reading their list, and everybody checks their lists for that idea and crosses it out. Then the next person goes, reading any of their remaining ideas, and so on until all you have left are unique ideas (and it doesn't matter if all your ideas are crossed out; remember, no judgements yet). If you are alone, you can skip this step.

Then, and only then, do you go through the final list of ideas and discuss them. Here, you judge the ideas, reject some, combine others, mix, match, and puree until you get something that works.

Top comment by jraph

This comes up again and again, but a personal workaround until things change is to disable JavaScript on Medium. You are then presented with a fully working page, without any distraction. It "gracefully improves".

Top comment by hprotagonist

Any base-model Brother laser printer, preferably one that supports Wi-Fi printing.

I've had one for 7 years, and replaced the toner exactly once.

Top comment by fitzroy

Instead of asking for each site, just allow first-party cookies and delete them by default when the last tab of that domain is closed. The user should be able to favorite cookies to keep indefinitely, with the rest being cleared on a user-defined schedule (onTabClose, 1 hour, 24 hours, 1 week, etc.). There was a free Safari extension called Safari Cookies that handled the favoriting, but it stopped working several years ago. https://sweetpproductions.com/safaricookies/index.htm

I'm surprised this isn't a standard feature built into browsers. It seems obvious to have a level of granularity between "accept all first-party cookies" and "accept none."

Edit: to clarify, I don't think setting cookies is the issue (and it's not worth the UX hassle to ask every time); the issue is storing the cookies for longer than the interaction persists. To me, it's analogous to someone remembering who you are during a conversation vs. adding you to their Rolodex and storing that info indefinitely.
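As a toy model of the proposed policy (a sketch of mine, not an actual browser API; a real implementation would live in the browser or a WebExtension):

```python
# Toy model of the policy described above: first-party cookies live
# only as long as some tab of their domain is open, unless the user
# has "favorited" the domain. All names here are invented.
class CookieJar:
    def __init__(self, favorites=()):
        self.favorites = set(favorites)  # domains kept indefinitely
        self.cookies = {}                # domain -> {key: value}
        self.open_tabs = {}              # domain -> open-tab count

    def open_tab(self, domain):
        self.open_tabs[domain] = self.open_tabs.get(domain, 0) + 1

    def set_cookie(self, domain, key, value):
        # Setting is always allowed -- no per-site prompt.
        self.cookies.setdefault(domain, {})[key] = value

    def close_tab(self, domain):
        self.open_tabs[domain] -= 1
        # onTabClose schedule: purge when the *last* tab of the
        # domain closes, unless the domain is favorited.
        if self.open_tabs[domain] == 0 and domain not in self.favorites:
            self.cookies.pop(domain, None)

jar = CookieJar(favorites={"example-bank.com"})
jar.open_tab("news.site")
jar.set_cookie("news.site", "session", "abc")
jar.close_tab("news.site")
print(jar.cookies)  # {} -- news.site forgot us when the tab closed
```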

Top comment by mihemihe

I have been doing the same for approximately the last 20 years: getting home and...

- Browsing the internet (Reddit, Hacker News, or whatever site was popular back then)
- Gaming
- Learning some IT stuff, either related to my job or not
- Coding

A few months ago, I started getting away from the screen and doing something else, because that way of living was literally killing me. I had gotten extremely lazy and a bit overweight, with atrophied muscles, a back starting to bend, more frequent procrastination and brain fog, not enough social life, and deteriorating communication skills from a lack of contact with other humans.

I started running, going to the gym, doing weights, pinging friends to go visit them, reading books not related to IT, going to organized running events on weekends, cooking, and other activities, going outside the house/hotel to smoke, and allocating only some time for my old pastimes (internet, gaming, coding, learning). This has completely changed my life:

- I feel more energetic, sleep better, and wake up fresher.
- I reached my ideal weight.
- I am getting my muscles toned and have increased flexibility.
- I have found joy in doing things not attached to a keyboard and mouse (like cooking).
- I have improved my relationships with my friends.
- And in general, I am happier than before.

I do not know how old you are, or whether your question was intended to find fancy things to do in your free time, but I can tell you honestly:

Please, DO NOT stay sitting in a chair after work in an isolated environment for years to come. Hit the gym, reconnect with friends, look for alternative hobbies, and reduce the time dedicated to the screen. Your 20-years-older self will thank you.

The only thing I have not been able to achieve is quitting smoking. Many tries, no success, but I hope to manage it by the end of the year.