

Issue #204 - February 5, 2023

If you are looking for work, check out this month's Who is hiring?, Who wants to be hired? and Freelancer? Seeking Freelancer? threads.

Here are the top threads of the week, happy reading!

Top comment by mtlynch

Staying warm.

In the winter, I used to stay warm by turning up the thermostat. Then I discovered (via HN) the Low-Tech Magazine article, "Insulation: first the body, then the home." [0] The article argued that it's much more efficient to focus on heating yourself rather than your whole living space.

I invested in high-quality wool clothes that I wear in layers, plus warm slippers. Now I keep my home about 5 degrees F cooler than I used to for the same comfort, which is a big reduction in oil and wood consumption for home heating.

[0] https://www.lowtechmagazine.com/2011/02/body-insulation-ther...

Top comment by stjo

Andrej Karpathy's "Neural Networks: From Zero to Hero". https://karpathy.ai/zero-to-hero.html

Just watch the first lecture and you won't be able to not watch the rest. It starts with making your own autograd engine in 100 lines of Python, similar in spirit to PyTorch, and then builds up to a GPT network. He's one of the best in the field: a founding member of OpenAI, then Director of AI at Tesla. Nothing like the scam tutorials that just copy-paste random code from the internet.
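For a feel of what that first lecture builds, here is a rough sketch of a tiny scalar autograd engine in the same spirit (written for this newsletter as an illustration; it is not Karpathy's actual micrograd code):

  # Rough sketch of the core idea: a scalar "Value" that records how it was
  # produced, so gradients can flow backwards through the computation graph.
  class Value:
      def __init__(self, data, _children=()):
          self.data = data
          self.grad = 0.0
          self._backward = lambda: None
          self._prev = set(_children)

      def __add__(self, other):
          other = other if isinstance(other, Value) else Value(other)
          out = Value(self.data + other.data, (self, other))
          def _backward():
              self.grad += out.grad
              other.grad += out.grad
          out._backward = _backward
          return out

      def __mul__(self, other):
          other = other if isinstance(other, Value) else Value(other)
          out = Value(self.data * other.data, (self, other))
          def _backward():
              self.grad += other.data * out.grad
              other.grad += self.data * out.grad
          out._backward = _backward
          return out

      def backward(self):
          # Topologically sort the graph, then apply the chain rule in reverse.
          topo, seen = [], set()
          def build(v):
              if v not in seen:
                  seen.add(v)
                  for child in v._prev:
                      build(child)
                  topo.append(v)
          build(self)
          self.grad = 1.0
          for v in reversed(topo):
              v._backward()

  # Gradients of loss = a*b + b with respect to a and b.
  a, b = Value(2.0), Value(-3.0)
  loss = a * b + b
  loss.backward()
  print(loss.data, a.grad, b.grad)  # -9.0, -3.0, 3.0

Calling backward() walks the graph in reverse topological order and applies the chain rule at each node, which is essentially what PyTorch's autograd does at tensor scale.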

Top comment by RobinL

Try to write any complex SQL as a series of semantically meaningful CTEs. Test each part of the CTE pipeline with an in.parquet and an expected_out.parquet (or in.csv and expected_out.csv if you have simple datatypes, so it works better with git). And similarly test larger parts of the pipeline with 'in' and 'expected_out' files.

If you use DuckDB to run the tests, you can reference those files as if they were tables (select * from 'in.parquet'), and the tests will run extremely fast.
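As a rough illustration of what such a test can look like with pytest and DuckDB (the file paths, query and column names here are invented for the example):

  import duckdb
  from pandas.testing import assert_frame_equal

  def test_monthly_totals_cte():
      con = duckdb.connect()
      # DuckDB lets you query the parquet files as if they were tables.
      actual = con.execute("""
          with monthly_totals as (
              select customer_id, sum(amount) as total
              from 'tests/monthly_totals/in.parquet'
              group by customer_id
          )
          select * from monthly_totals
          order by customer_id
      """).df()
      expected = con.execute(
          "select * from 'tests/monthly_totals/expected_out.parquet' order by customer_id"
      ).df()
      assert_frame_equal(actual, expected)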

One challenge if you're using Spark is that tests can be frustratingly slow to run. One possible solution (that I use myself) is to run most tests using DuckDB, and run only e.g. the overall end-to-end test using Spark SQL.

I've used the above strategy with pytest, but conceptually I don't think it's particularly sensitive to the programming language/test runner you use.

Also I have no idea whether this is good practice - it's just something that seemed to work well for me.

The approach with CSVs can be nice because your customers can review these files for correctness (they may be the owners of the metric) without needing to be coders. They just need to confirm that in.csv should result in expected_out.csv.

If it makes it more readable, you can also inline the 'in' and 'expected_out' data, e.g. as a list of dicts, and pass it into DuckDB as a pandas dataframe.

One gotcha is that SQL does not guarantee row order, so you need to sort somehow or otherwise ensure your tests are robust to this.
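A rough sketch combining the last two points, inlining the 'in' and 'expected_out' data as dicts and sorting in the query so the comparison is deterministic (table and column names are invented for the example):

  import duckdb
  import pandas as pd
  from pandas.testing import assert_frame_equal

  def test_totals_with_inline_data():
      in_df = pd.DataFrame([
          {"customer_id": 1, "amount": 10.0},
          {"customer_id": 1, "amount": 5.0},
          {"customer_id": 2, "amount": 7.0},
      ])
      expected_df = pd.DataFrame([
          {"customer_id": 1, "total": 15.0},
          {"customer_id": 2, "total": 7.0},
      ])

      con = duckdb.connect()
      con.register("in_table", in_df)  # expose the dataframe as a table
      actual = con.execute("""
          select customer_id, sum(amount) as total
          from in_table
          group by customer_id
          order by customer_id  -- SQL output order is not guaranteed otherwise
      """).df()
      # check_dtype=False tolerates minor dtype differences between DuckDB's
      # output and the literal expected frame.
      assert_frame_equal(actual, expected_df, check_dtype=False)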

Top comment by kjellsbells

Having the 5G logo on your phone doesn't necessarily mean your phone is speaking 5G. It's... complicated.

- Most US operators do not have ubiquitous 5G radio coverage, so they do what is called 5G NSA (non-standalone), where your traffic ultimately ends up being processed by the 4G core infrastructure they already have, which is overloaded and a bit creaky.

- Some operators can do real 5G, "5G SA" (standalone), where the whole flow end to end runs on 5G infrastructure, but whether this is faster than LTE depends on the spectrum band in use. Verizon has some high-bandwidth spectrum, but it doesn't propagate as well as lower (and slower) frequencies like what T-Mobile has. But if you expect to get those blazing multi-gig speeds, you really have to be on that high-bandwidth stuff. For most people, most of the time, they won't be, so they won't see much improvement over LTE.

- I assert that it is dawning on operators that consumers are not interested in paying $10 extra per month for 5G. This is a bit of a problem when those same operators spent or borrowed billions to obtain the spectrum in the first place. Did I mention that the era of cheap money is now over and those debt payments are due?

- Telcos desperately need a killer app or use case that drives 5G adoption. They haven't got one. And remember, the app must be one that telcos can monetize. They still have scars from their failure to capture the value of smartphone applications in the LTE era.

Top comment by jodoherty

I live in a region with a lot of government contracting businesses, so Red Hat Enterprise Linux is something I have to maintain a working familiarity with.

However, I use Debian for all of my personal projects and infrastructure.

The reason? There's no for-profit corporate interest directly controlling the project. The project's organizational structure resembles a constitutional democracy:

https://www.debian.org/intro/organization

There is an incorporated entity in the United States to handle a number of intellectual property and financial concerns:

https://www.spi-inc.org/projects/debian/

However, it exists as a non-profit with a very narrowly defined, specific set of purposes:

https://www.spi-inc.org/corporate/certificate-of-incorporati...

Because of this, I feel like the Debian project has a good combination of people and resources, making it easy to rely on long-term, but without the for-profit corporate interests that may conflict with my own in the future.

Top comment by dang

Recent and related:

John Carmack’s ‘Different Path’ to Artificial General Intelligence - https://news.ycombinator.com/item?id=34637650 - Feb 2023 (402 comments)

Top comment by phphphphp

I am not a tax expert, but my understanding is that a business would claim expenditure as R&D because of the beneficial tax treatment: it's a choice you make to categorise expenditure as R&D; you're under no obligation to do so. If the tax treatment of R&D spend has changed to be less favourable in your circumstances (i.e., you can't afford the short-term cost of amortisation), then you would not claim the spend to be R&D related. After all, a small technology company working on their revenue-generating product is not doing anything experimental: it's only experimental if you massage it as such.

I could be far off the mark -- so please correct me if I am wrong -- but your framing suggests that if a business spends money on software development then they must amortise the cost which does not seem to be correct.
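To make the cash-flow concern concrete, here is a back-of-the-envelope illustration (the figures and the straight-line five-year schedule are assumptions for the example, not tax advice):

  # Back-of-the-envelope illustration only, not tax advice: deducting $500k of
  # developer cost immediately vs. amortising it straight-line over 5 years.
  spend = 500_000
  tax_rate = 0.21

  expensed_deduction_year1 = spend       # deduct it all in year one
  amortised_deduction_year1 = spend / 5  # only 1/5 deductible in year one

  extra_taxable_income = expensed_deduction_year1 - amortised_deduction_year1
  extra_tax_year1 = extra_taxable_income * tax_rate

  print(extra_tax_year1)  # roughly $84,000 of extra tax due in year one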

Top comment by nfriedly

For what it's worth, I used to do work on freelancing sites ~10 years ago and just tended to ignore the race to the bottom. I initially charged $50/hr, and regularly raised my rates so that I was up over $150/hr by the time I stopped doing it.

Of course I got passed over for a lot of jobs in favor of cheaper folks. But the jobs I did get were from clients who actually respected me. Also, more than once, a client who initially passed me over for someone cheaper came back a few months later and asked me to do the job after all.

So, perhaps something like that could work for you.

--

Regarding the $1,200 per month limit, I'm not sure what the rules are, but perhaps you could set up a corporation that takes on the freelancing jobs and then pays you a salary of $1,200 a month? That way you wouldn't have to turn down a job for paying too much.

Maybe have the corp owned by a trust rather than you personally?

I wouldn't want you to get in trouble and lose the disability, though, so talk to somebody who actually knows what they're talking about before doing any of this stuff.

Top comment by com2kid

JSON Schema is awesome. I wish TypeScript had better support for it, though; having to do stuff in both Zod and JSON Schema sucks.

I have a system I built that compiles TS types to JSON Schema, which then validates data coming into my endpoints. This way I am type-safe at compile time (TypeScript API), but if someone hits my REST endpoint without using my library, I still get runtime goodness.

The number of different ways that JSON Schema can be programmatically generated (and therefore expressed) is a bit high; different tools generate very different JSON Schemas.

Also, the error messages JSON Schema gives back are kind of trash; then again, the JSON Schema on one of our endpoints is over 200KB in size.
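The validate-at-the-runtime-boundary half of this pattern is language-agnostic; as a rough illustration, here it is in Python with the jsonschema package (the commenter's own stack is TypeScript with Zod, so this is only a sketch of the idea, with a made-up schema):

  # Minimal illustration of runtime validation against a JSON Schema.
  from jsonschema import Draft7Validator

  user_schema = {
      "type": "object",
      "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer", "minimum": 0},
      },
      "required": ["name", "age"],
      "additionalProperties": False,
  }

  validator = Draft7Validator(user_schema)

  payload = {"name": "Ada", "age": "not-a-number"}
  for err in validator.iter_errors(payload):
      # This is the kind of error message the comment complains about.
      print(list(err.path), err.message)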

Top comment by capableweb

Reach out to various existing infrastructure projects that build and maintain large community-owned P2P WiFi deployments; the two largest ones are Freifunk and Guifi. They would surely be able and willing to help you if you reach out. Try https://freifunk.net/en/contact/ and https://matrix.guifi.net/

Most of them are using commodity hardware from MikroTik, Teltonika and Ubiquiti. A basic setup for a personal node is just an antenna + router. Then they usually have the concept of "supernodes", which are responsible for hooking up multiple personal nodes and have directional antennas (often several of them) plus bigger routers to facilitate the routing.

I'm not sure you'll be able to put together a supernode with decent range for under $100 though; I think the cost would be more than that, but I would be happy to be proven wrong.

In terms of firmware, I've almost exclusively seen OpenWrt being used (with the rest running the default MikroTik/Ubiquiti firmware), with various self-made patches applied before installing it on the hardware.