https://lbrito1.github.io/blog/A Developer's Notebook2023-12-26T23:30:00ZLeonardo Britohttps://lbrito1.github.iotag:lbrito1.github.io,2023-12-26:/blog/2023/12/leaving-amazon.htmlLeaving Amazon2023-12-26T23:30:00Z2023-12-26T23:30:00Z<div class="image-box stretch">
<div>
<a href="/2023/12/leaving-amazon.html">
<img class="lazy" data-src="/blog/assets/images/2023/leaving-amazon-sm.jpg" alt="AI-generated image, parody of office workers vs remote workers. Oil painting in the style of Hieronymus Bosch.">
<noscript>
<img src="/blog/assets/images/2023/leaving-amazon-sm.jpg" alt="AI-generated image, parody of office workers vs remote workers. Oil painting in the style of Hieronymus Bosch.">
</noscript>
</a>
</div>
<div class="image-caption">OpenAI is so much fun!</div>
</div>
<p>I’d like to preface this by stating that Amazon is obviously a huge company, and my opinions are just that, one person’s opinions. There will probably be some people that share my frustrations while others have had a completely different experience.</p>
<p>I interviewed at AWS in early 2020, pre-pandemic. The interview process is grueling, and I spent considerable effort preparing for the five-hour-long pantomime of absurd algorithms trivia and “tell me about a time when you said no” behavioral questions. COVID-induced visa processing delays pushed my start date back many times. The high-stress interview process and years-long wait built up tremendous anticipation. In hindsight, I can say I probably had somewhat unrealistic expectations when I finally joined the company in late 2021.</p>
<p>Regardless, I was quite frankly shocked after my first couple of weeks, and my first impression was that this kind of work was not for me. As a software engineer, I expected to eventually do some software engineering. I’m not sure how to describe the work that first team was doing, but I can’t in good conscience call it software engineering.</p>
<!-- more -->
<p>The bulk of the work consisted of updating configuration files and attending meetings with other teams where the driver would recite what was written in project management cards. No one seemed to really, deeply know what the team was doing (or maybe I was too thick to understand it!), and quite frankly no one seemed to care very much. Having worked at very product-centric, fast-paced startups before, I found this kind of apathetic, complacent attitude bizarre. If you browse Blind for more than five minutes, though, you will see for yourself that this kind of experience is far from exceptional.</p>
<p>I found my way out of that team as fast as I could. The ability to effortlessly change teams is something truly praiseworthy at Amazon. I got no pushback from my former manager. The whole process is extremely simple and streamlined.</p>
<p>Although the new team was far more interesting from an engineering perspective and my coworkers much more motivated, it was still an Amazon team. Over time I would understand that while Amazon has a great diversity of teams, they’re all playing the same tune. These are things that bothered me, in no particular order:</p>
<ul>
<li>Big companies have a way of codifying their customs and culture into lists of commandments (<a href="https://nymag.com/intelligencer/article/ray-dalio-rob-copeland-the-fund-book-excerpt.html">1</a>, <a href="https://www.kochagenergy.com/marketbasedmanagement/">2</a>, <a href="https://www.amazon.jobs/content/en/our-workplace/leadership-principles">3</a>) that are often vague and contradictory. No one has figured out yet how to make true believers out of new hires, but you are expected to at least pay lip service to their moral code. In the day-to-day work life of rank-and-file employees, these dogmas are at best a hurdle that regular folk have to navigate around, and at worst become weapons to justify doing or not doing whatever you want.</li>
<li>Tooling. No matter how good an external solution is, there is an internal tool that is twice as difficult to use and half as good. This isn’t any developer’s fault, by the way, and almost always is a side effect of either legal requirements (software licensing), legacy decisions made long ago, or empire building. More on that later.</li>
<li>It is a large company where decisions flow from top to bottom. These decisions are to be accepted as facts of life. There is no conversation, no room for debate, and no one to appeal to. No exception, no accommodation. Scream against the wind all you want, write a petition with 30,000 signatures - doesn’t matter.</li>
<li>Big ships take a long time to turn around. Everything moves slowly and there is little space for experimentation. Do you have a good idea? Maybe something dozens of others also agree is a good idea? Good luck getting that into any roadmap.</li>
<li>Empire building. This follows almost as a direct result of the previous two points. Higher-level people decide things and those things get built, regardless of whether it’s a good idea <a href="https://www.cnbc.com/2023/10/04/amazon-shuts-down-amp-live-audio-service.html">or not</a>. Promotions are vastly politics-driven, and large orgs seem to form around whoever has the most political clout rather than around any technical or business reality.</li>
<li>Apathy towards the craft. I struggled to find passionate, highly motivated and competent engineers at Amazon. I met a few excellent engineers, but most seemed to only punch in and out, with near-zero interest in anything related to programming or technology outside of working hours.</li>
<li>Misaligned incentives with the Leadership Principles. LPs are always the strongest force directing everything at work; engineers are trained to think about how their work artifacts align with the LPs, <em>not</em> on the quality or usefulness of what they are doing, which is usually left as an “if we have time for that” afterthought.</li>
<li>Unstable, untrustworthy leadership. I was truly blessed to work with first-rate direct managers. Higher-level leadership, however, exhibited some truly bizarre behavior over the last few years, like the poorly communicated, seemingly endless waves of layoffs or the disastrous return-to-office and return-to-hub policies.</li>
<li>Collaboration culture. While at any other job I would expect to be able to find a subject-matter expert on something, contact them and maybe do some quick pair programming, this is highly frowned upon at Amazon. There are good reasons for that: the company is too big for this kind of fast, informal interaction, and SMEs of highly demanded services would quickly be overwhelmed by DMs if this were a thing. Moats are built to prevent that; instead of just asking someone about something, you will often have to file a ticket and wait a week or more for office hours, which end up being about as useful as DIY research in the internal platforms.</li>
</ul>
<p>Some of the above is to be expected in any large company, and I could live with many of those issues. The one thing that convinced me that I needed to leave, though, was the stuck-in-quicksand feeling that working at Amazon was a career terminus. This was the first time in my career I felt that my day job was actively working against me, making me a <em>worse</em> software developer. There’s a choice to be made: keep up with the industry or keep up with Amazon’s internal technologies and tooling.</p>
<p>There was no new technology or innovative engineering process for me to learn. Simple problems become complicated ones because of the internal tooling overhead, leaving little time to work on more interesting things.</p>
<p>Going full circle, I want to restate that the above is just my experience. I’m sure there are teams, maybe many of them, that contradict some or all of what I listed. I’ve never had demanding on-call rotations, for instance, which are an extremely common complaint at Amazon (I’ve personally seen friends and acquaintances leave a dinner party after being paged). To those who enjoy working there I wish only the best.</p>
tag:lbrito1.github.io,2022-07-15:/blog/2022/07/books-sapiens.htmlBook review - Sapiens2022-07-15T17:54:50Z2022-07-15T17:54:50Z<div class="image-box stretch">
<div>
<a href="/2022/07/books-sapiens.html">
<img class="lazy" data-src="/blog/assets/images/2022/obelisk.jpg" alt="Obelisk scene from the movie 2001">
<noscript>
<img src="/blog/assets/images/2022/obelisk.jpg" alt="Obelisk scene from the movie 2001">
</noscript>
</a>
</div>
<div class="image-caption">Sapiens in an image.</div>
</div>
<p>Sapiens makes no single central point; rather, it weaves an intricate web of mostly interdependent theories and speculations. This makes for an enjoyable read, though at times it is easy to lose sight of the original premises used to build increasingly speculative conclusions.</p>
<!-- more -->
<p>The author’s reflections on early humanity – what he calls the <a href="https://en.wikipedia.org/wiki/Behavioral_modernity">Cognitive Revolution</a> – and the unforeseen consequences of the <a href="https://en.wikipedia.org/wiki/Neolithic_Revolution">Agricultural Revolution</a> are the most interesting. Homo Sapiens’ edge over competing species was the ability to operate in shared imagined worlds, which eventually developed into modern states, complex economies and religions. These shared worlds allow large numbers of individuals to cooperate on large projects unfeasible for smaller bands. And while this might have hugely increased the human population, it also meant worse living conditions for the average person: grain-based civilizations are more susceptible to famines, droughts and plagues than their hunter-gatherer predecessors.</p>
<p>Less interesting are the speculations on the future of humanity in the later chapters, which basically envision a very prolonged or eternal life based on vague hopes of medical breakthroughs and transhumanism. It’s really just a very tired rehash of <a href="https://en.wikipedia.org/wiki/Tree_of_life">ancient myths</a>. Weirdly, Harari implicitly admits as much by citing those same myths and insinuating that his hopes are in the same vein.</p>
tag:lbrito1.github.io,2022-07-13:/blog/2022/07/books-collapse.htmlBook review - Collapse2022-07-13T17:54:50Z2022-07-13T17:54:50Z
<div class="image-box stretch">
<div>
<a href="/2022/07/books-collapse.html">
<img class="lazy" data-src="/blog/assets/images/2022/viking-greenland-sm.jpg" alt="Hvalsey church ruins, Greenland">
<noscript>
<img src="/blog/assets/images/2022/viking-greenland-sm.jpg" alt="Hvalsey church ruins, Greenland">
</noscript>
</a>
</div>
<div class="image-caption">Hvalsey church ruins, Greenland. Credit: https://en.wikipedia.org/wiki/User:Number_57</div>
</div>
<p><em>Collapse</em> is a fascinating, if somewhat exhausting, read. The central point of the book is that environmental changes, man-made or not, have been responsible for many a civilization’s collapse.</p>
<!-- more -->
<p>In the year 2022 that claim might perhaps sound obvious, but the fascinating part is learning <em>how</em> these stories unfold, often with unexpected twists and many unintended consequences. For instance, medieval Scandinavians found in Greenland a scenery not unlike their native northern Europe, and proceeded to build a society over there replicating what they had back home. This worked for a while – centuries, in fact – but proved ultimately unsustainable and disastrous. An overview of modern Montana’s environmental and economic woes sends the message: what happened before might be happening again.</p>
<p>Less brilliant are the chapters on Central America, specifically the chapter on Haiti. It glosses over the island’s modern history while failing to mention the single most important historical fact about it: the <a href="https://theconversation.com/when-france-extorted-haiti-the-greatest-heist-in-history-137949">extortion of Haiti by France</a>, in which France, backed by the United States, demanded indemnity for the loss of property (including the enslaved Haitians themselves), militarily extorting the tiny island with warships in what was probably modern history’s greatest heist. <a href="https://en.wikipedia.org/wiki/Haiti_indemnity_controversy">The last payment was made only in 1947</a>. Given the minutiae of other historical facts included in the book, which are trivia by comparison, this jarring gap can only be seen as intentional.</p>
<p>As for the previously mentioned exhaustion: the book gets its point across clearly long before its final chapters, which in turn seem a bit redundant and repetitive.</p>
<p>Jared Diamond is a somewhat controversial scholar, often accused of <a href="https://en.wikipedia.org/wiki/Environmental_determinism">geographic determinism</a>. <em>Collapse</em> contains plenty of self-justification to the contrary, which makes you wonder.</p>
<p>Be that as it may, and even with the embarrassing failings mentioned above, <em>Collapse</em> is an excellent read.</p>
tag:lbrito1.github.io,2022-07-10:/blog/2022/07/next-dalle.htmlDALL·E minis of the future won't be fun2022-07-10T23:30:00Z2022-07-10T23:30:00Z<div class="image-box stretch">
<div>
<a href="/2022/07/next-dalle.html">
<img class="lazy" data-src="/blog/assets/images/2022/stalin-drone-sm.jpg" alt="">
<noscript>
<img src="/blog/assets/images/2022/stalin-drone-sm.jpg" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>I’ve been playing with <a href="https://huggingface.co/spaces/dalle-mini/dalle-mini">dalle-mini</a> the last few weeks. Part of what makes it fun to play with are the bizarre and obtuse outputs. They reached that sweet spot between laughably bad and frighteningly perfect: they’re good enough to be understood and enjoyed, basically.</p>
<p>I think that incompleteness is part of what makes it so amusing to toy with these things, and conversely what will make future versions much less fun.</p>
<!-- more -->
<p>Dalle-mini is, as the name implies, much smaller than DALL·E, using 0.4 billion parameters instead of 12 billion. DALL·E isn’t entirely publicly available, so dalle-mini is more of a reconstruction/reverse-engineering effort than just a toned-down version (see <a href="https://wandb.ai/dalle-mini/dalle-mini/reports/DALL-E-Mini-Explained--Vmlldzo4NjIxODA#the-dall-e-experiment">their website</a>).</p>
<p>Future iterations will be far more impressive. Consider the <a href="https://simplified.com/blog/ai/dall-e-1-vs-dall-e-2/">difference between dall-e and dall-e 2</a>, for instance:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2022/dalle-1-2.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2022/dalle-1-2.jpg" alt="">
<noscript>
<img src="/blog/assets/images/2022/dalle-1-2.jpg" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>As they inch closer to perfection – perfection being “indistinguishable from human-made”, by the way – these models will surely be widely commercialized, and eventually easily available.</p>
<p>I think playing with a near-perfect “future dalle” will be as fun as comparing baking soda packaging at the grocery store. These models will become just another tool. I can see them being widely used to generate birthday card designs, obliterating the probably small niche of birthday card designers. I can’t see them being the next Bruegel the Elder or Hieronymus Bosch.</p>
<p>As these models become increasingly commoditized, they will blend in with the art industry and eventually be forgotten by the public like all novelties eventually are. An AI-generated corporate décor at the office will be as bland and uninteresting as a human-made one. Eventually the distinction won’t matter. Right now, though, it is hard to mistake a dalle-mini output as being created by a human, and I think that is part of its appeal.</p>
<p>Think about the sample inputs that OpenAI gives their model, like “an astronaut riding a horse in photorealistic style”. We don’t really need a near-perfect AI model to amuse ourselves with that thought – our brains are enough.</p>
<p>Perhaps something similar happened with videogames. Early 3D games would leave much up to the player’s imagination; the closer games get to perfect graphical realism, however, the more boring and uninteresting they seem (to me). I think the wave of indie games with intentionally “bad” graphics is no accident - there is an undeniable appeal in leaving things out.</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2022/obra-dinn.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2022/obra-dinn.jpg" alt="Screenshot of the video game Return of the Obra Dinn, showing a black-and-white 3D environment.">
<noscript>
<img src="/blog/assets/images/2022/obra-dinn.jpg" alt="Screenshot of the video game Return of the Obra Dinn, showing a black-and-white 3D environment.">
</noscript>
</a>
</div>
<div class="image-caption">Return of the Obra Dinn (https://en.wikipedia.org/wiki/Return_of_the_Obra_Dinn)</div>
</div>
<p>This boils down to what I think is a fundamental misconception about AI held by many people. AI doesn’t (and cannot) <em>create</em> anything, it just mixes and matches (in very smart ways) things created by humans. The training sets are immense, and the techniques are increasingly complex, but unless there is some radical, foundational pivot, AI is and will be ultimately derivative of human creations. And derivation is ultimately boring. Right now most people still don’t quite grasp this ontological difference between “real” intelligence, which creates things, and AI, which doesn’t (some will claim there really is no difference and humans must operate in the same way, but I won’t get into that).</p>
<p>This misunderstanding leads to pretty funny situations, like the Google engineer who <a href="https://www.washingtonpost.com/technology/2022/06/11/google-ai-lamda-blake-lemoine/">claimed a chatbot was sentient</a>. <a href="https://www.youtube.com/watch?v=iBouACLc-hw">It’s not</a>. It mixes texts in a smart way. I think people will eventually catch on to this, and when that happens, the improved “mini-dalle” of the future won’t be nearly as interesting or amusing as its less perfect predecessors.</p>
tag:lbrito1.github.io,2022-06-12:/blog/2022/06/tesla_dystopia.htmlTeslas are a dystopia2022-06-12T01:08:41Z2022-06-12T01:08:41Z<div class="image-box stretch">
<div>
<a href="/2022/06/tesla_dystopia.html">
<img class="lazy" data-src="/blog/assets/images/2022/tesla-sm.jpg" alt="A Tesla inside a tunnel.">
<noscript>
<img src="/blog/assets/images/2022/tesla-sm.jpg" alt="A Tesla inside a tunnel.">
</noscript>
</a>
</div>
<div class="image-caption">This could have been a subway.</div>
</div>
<p>Since moving to Metro Vancouver I’m continuously surprised by how common electric vehicles have become. Some may see the rise of EVs as an exciting turn towards a futuristic <a href="https://en.wikipedia.org/wiki/Solarpunk">solarpunk</a> utopia. I see the opposite: they are a dystopia of sorts. They are a dead end, a waste of resources in the wrong direction, a false hope.</p>
<p>The proliferation of personal electric vehicles is a strong marker of failure. It is a manifestation in the physical world of our inability as a society to move on from a clearly failed, car-centric way of living. Teslas kind of epitomize this – despite being just another very expensive car, despite <a href="https://www.theguardian.com/world/2022/may/27/tesla-catches-fire-vancouver-canada-investigation">catching fire</a> <a href="https://www.msn.com/en-us/news/us/sac-metro-fire-challenged-by-burning-tesla/ar-AAYnuYV">every</a> <a href="https://www.msn.com/en-us/autos/news/nhtsa-steps-up-tesla-investigation-of-phantom-braking-crashes-into-emergency-vehicles/ar-AAYhkQk">now</a> <a href="https://fox2now.com/news/illinois/tesla-catches-fire-on-route-3-in-brooklyn-illinois/">and then</a>, despite <a href="https://www.thedrive.com/news/38579/these-repair-bulletins-for-teslas-quality-problems-are-downright-embarrassing-and-serious">bizarre QC issues</a>, despite its <a href="https://www.truthorfiction.com/did-elon-musk-tweet-we-will-coup-whoever-we-want-deal-with-it/">nitwit CEO</a>, still they are seen as cool and fashionable and trendy.</p>
<!-- more -->
<p>EVs are just an incremental improvement on cars. This painful reality hit me when I first <em>heard</em> an EV whooshing down the street: I couldn’t really tell it apart from any other vehicle. Over the years I had read so many hyped articles about how EVs were so quiet that they needed artificial sounds to warn pedestrians that I expected nothing less than library-level quietness from an EV. Instead I heard the same deafening roar as any other car.</p>
<p>It turns out that if you put a tonne of metal on rubber tires rolling over asphalt, the rubber-on-surface sound is likely the dominant noise source at any appreciable speed:</p>
<blockquote>"Tire-pavement interaction noise (TPIN) dominates for passenger vehicles with the speed of above 40 km/h and for trucks with the speed of 70 km/h."</blockquote>
<p><a href="https://www.researchgate.net/publication/328012268_Literature_review_of_tire-pavement_interaction_noise_and_reduction_approaches">Literature review of tire-pavement interaction noise and reduction approaches, Tan Li, 2018</a></p>
<p>As speed increases, it quickly doesn’t matter if you even have an engine at all, as aerodynamic noise also factors in:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2022/car-noise.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2022/car-noise.png" alt="Chart with noise sources of cars by speed. At 15mph or so, propulsion noise gives way to tire-pavement noise as the largest source.">
<noscript>
<img src="/blog/assets/images/2022/car-noise.png" alt="Chart with noise sources of cars by speed. At 15mph or so, propulsion noise gives way to tire-pavement noise as the largest source.">
</noscript>
</a>
</div>
<div class="image-caption">Past 15mph or so, propulsion noise is no longer the largest noise source.</div>
</div>
<p>So the sound will be the same. So will other things: the parking space it requires is the same, the urban sprawl it stimulates is the same, the accidents are the same (<a href="https://jalopnik.com/teslas-navigate-is-worse-than-human-driving-consumer-r-1834944173">or worse</a>). Everything that matters is basically the same.</p>
<p>For all the <a href="https://thedriven.io/2021/12/21/tesla-model-3-and-model-y-get-dancing-lights-in-2021-holiday-update/">distracting “features”</a> of these high-tech gizmos, in essence <em>they are still just cars</em>. No matter how much high tech is embedded into them, they are ontologically the same as the first Model T.</p>
<p>The most obvious improvement of EVs is in energy efficiency/pollution. Ironically this seems to be a very old pro-car argument. In <a href="https://www.goodreads.com/book/show/30833.The_Death_and_Life_of_Great_American_Cities"><em>The Death and Life of Great American Cities</em></a>, written over half a century ago, Jane Jacobs notes that early car proponents defended the personal vehicle as <em>cleaner</em> than its predecessor – horses. Internal combustion engines don’t poop, so the streets were in fact cleaner than before. That is all fair and just; she argued that the issue, however, was that horses were being replaced with <em>far too many</em> cars – in other words, the one-car-per-person model of North America. Essentially the same argument about cleanliness is being repeated now, a century later, with EVs.</p>
<p>Pollution spewing from combustion engines, although a serious issue, was never the only nor the greatest problem caused by cars (and EVs <a href="https://www.euronews.com/green/2022/02/01/south-america-s-lithium-fields-reveal-the-dark-side-of-our-electric-future">still generate plenty of pollution</a>, just not inside the engine, but in power plants and factories). It is disingenuous to pretend otherwise.</p>
<p>The rise of EVs is so disheartening because it is a huge missed opportunity and waste of resources. Even more so because it is being <a href="https://www.cnn.com/2021/09/08/business/biden-uaw-electric-vehicles-climate/index.html">pushed by</a> and within rich Western economies, that tend to lead the way to the rest of the world. Climate change, failing cities and fossil fuel scarcity could have catalyzed a new era of major public transit infrastructure projects and a shift in zoning practices towards sane densification. But those things are hard – culturally and politically – and people will do basically anything to avoid having to do them.</p>
<p>EVs are the perfect kludge that enables us to do nothing. People get a clean conscience thinking they are “doing something”, while not really addressing the underlying issue (and <a href="https://www.instituteforenergyresearch.org/renewable/the-environmental-impact-of-lithium-batteries/">creating some brand new problems</a> as well). <a href="https://bc.ctvnews.ca/vancouver-residents-rally-against-broadway-plan-1.5893686">Selfish NIMBYism</a> and infinite sprawl can carry on unbothered.</p>
tag:lbrito1.github.io,2022-04-13:/blog/2022/04/digital_nomads.htmlAre "digital nomad visas" a thing yet?2022-04-13T08:00:00Z2022-04-13T08:00:00Z<div class="image-box stretch">
<div>
<a href="/2022/04/digital_nomads.html">
<img class="lazy" data-src="/blog/assets/images/2022/cloud_opt.jpg" alt="A foggy morning in Coquitlam, Canada.">
<noscript>
<img src="/blog/assets/images/2022/cloud_opt.jpg" alt="A foggy morning in Coquitlam, Canada.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Immigration sucks. In addition to the personal toll it takes on anyone, it is also mind-numbingly tedious and baroquely complex. Why aren’t things <strong>better</strong> by now?</p>
<p>This reminded me of the so-called “digital nomad visas”. Searching for that term will get you a thousand clickbaity Wordpress sites listing the “20 best countries with nomad visas” or whatever. But how real are they <em>really</em>?</p>
<!-- more -->
<p>I’ll admit the term sounds pretty cool. It sounds like something you’ll be doing on your phone, as opposed to over the hundreds of PDF files and scanned documents a normal application will demand.</p>
<p>In fact, the name is so enticing I’m pretty sure <strong>that’s the whole point</strong>. It is so radically opposite to the idea of traditional visas that people can’t help but think that <em>they are</em> opposites.</p>
<p>The “digital nomad” part leads one to believe that the stamp is somehow catered towards tech workers with remote jobs; thus, being directed at that demographic, it makes sense to assume that the visa has the characteristics that make it <em>more desirable to them</em> than traditional visas – otherwise, what is the point of these visas? In short, I think this is what most people tend to assume when encountering the term for the first time:</p>
<table>
<thead>
<tr>
<th> </th>
<th>traditional visa</th>
<th>digital nomad visa</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Requirements</strong></td>
<td>Many</td>
<td>Few</td>
</tr>
<tr>
<td><strong>Application process</strong></td>
<td>Hard, time-consuming, complex, lots of paper documents and legalese</td>
<td>Easy, fast, simple, lots of e-documents</td>
</tr>
<tr>
<td><strong>Time</strong></td>
<td>Measured in aeons</td>
<td>Measured in days</td>
</tr>
<tr>
<td><strong>Path to full residency</strong></td>
<td>Tortuous</td>
<td>Streamlined</td>
</tr>
<tr>
<td><strong>Uniqueness</strong></td>
<td>Specific to each country</td>
<td>Same/similar between countries</td>
</tr>
<tr>
<td><strong>Time limits</strong></td>
<td>Constrained</td>
<td>Unconstrained</td>
</tr>
<tr>
<td><strong>Coolness</strong></td>
<td>Minivan heading to football practice</td>
<td>Convertible doing doughnuts</td>
</tr>
</tbody>
</table>
<p><strong>What people think is going on ☝️</strong></p>
<p>What really goes on is that these digital thingies are almost always just a rebranding of the same old temporary visitor visas. Now this is where things get interesting: <strong>who is advertising</strong> these visas as “digital nomad visas”? At first I assumed governments were doing the rebranding, but that is not always the case. Some countries do use the specific wording “digital nomad” when describing these visas, like <a href="https://home.kpmg/xx/en/home/insights/2021/09/flash-alert-2021-240.html">Greece’s Law 4825/2021</a> and, more famously, <a href="https://www.e-resident.gov.ee/nomadvisa/">Estonia</a>; in most cases, however, the publicity seems to come from elsewhere.</p>
<p>For instance, take the <a href="https://www.vfsglobal.com/portugal/Brazil/pdf/D7.pdf">Portuguese D7 visa</a>, which is very specifically meant for members of religious orders, retirees and people with significant passive income, but is touted as a “digital nomad visa”:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2022/portugal-d7-form.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2022/portugal-d7-form.jpg" alt="PDF form saying that this visa is meant for retirees and members of religious orders.">
<noscript>
<img src="/blog/assets/images/2022/portugal-d7-form.jpg" alt="PDF form saying that this visa is meant for retirees and members of religious orders.">
</noscript>
</a>
</div>
<div class="image-caption">Official D7 requisition form. It literally says it is meant for retirees, members of religious orders and people living off passive income.</div>
</div>
<p>Now if you search for “portugal digital nomad visa”, guess which visa you are led to believe is tailored for, well… digital nomads?</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2022/portugal-d7.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2022/portugal-d7.jpg" alt="Search results for 'digital nomad visa portugal' showing the D7 visa in.">
<noscript>
<img src="/blog/assets/images/2022/portugal-d7.jpg" alt="Search results for 'digital nomad visa portugal' showing the D7 visa in.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>I’m not saying that the people connecting “digital nomad” to the D7 visa are wrong in any way – they might be totally correct. In fact, among the search results there are so many law firms, consultancies, blogs and vlogs selling the D7 as exactly that, they are probably right. Have they found a loophole? Am I missing something – perhaps all remote workers are really nuns and priests in disguise? Do I just suck at googling and missed something obvious?</p>
<p>Be that as it may, one thing I would love to see is how many people successfully live as digital nomads with that visa.</p>
<p>Anyway, my point is that the <strong>publicity</strong> linking that visa to digital nomadism is coming from those <em>very interested parties</em>, and <em>not</em> the government of Portugal. In short, they have skin in the game, and you can bet they are making money off of that connection.</p>
<p>I bet if someone investigated other countries touted as having “digital nomad visas” they would find similar results.</p>
<p>Back to the original question, when are these visas going to be a thing?</p>
<p>In a sense they already are, because these are old visas under new guises – be it the government or third parties doing the rebranding. Now the <em>idea</em> we have of how they ought to be – the happy second column in the table above – that’s something else, and I can’t see it becoming a thing, ever.</p>
<p>No matter what shiny wrapping you put around immigration, it is still immigration. Countries that have painful processes have them for a reason: put simply, there are more people willing to go through the process than they need, so they can be picky. That fundamental reality isn’t going to change just because a shiny new term has given us new expectations.</p>
<p>All countries have the same basic incentive for attracting more of these “digital nomads”: high salaries, which mean more taxes and consumption. How important that might be varies between countries, which explains the list of countries that have some kind of “digital nomad” visa – the larger world economies are usually not on those lists, and when some of them eventually are, the requirements are so stringent that they’re not really any different from a normal temporary work visa or a rich-person visa.</p>
<p>The closest thing I’ve seen to a “true” digital nomad visa is <a href="https://www.e-resident.gov.ee/nomadvisa/">Estonia’s</a>. They seem truly committed to the same ideals we described, and the process seems fairly straightforward. It’s the exception that proves the rule, sadly.</p>
tag:lbrito1.github.io,2021-12-30:/blog/2021/12/botched_interviews.htmlBotched interviews2021-12-30T12:34:36Z2021-12-30T12:34:36Z<div class="image-box stretch">
<div>
<a href="/2021/12/botched_interviews.html">
<img class="lazy" data-src="/blog/assets/images/2021/puerto-varas-sm.jpg" alt="Sunset in Puerto Varas.">
<noscript>
<img src="/blog/assets/images/2021/puerto-varas-sm.jpg" alt="Sunset in Puerto Varas.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Here’s something I’ve been wanting to write for a while: all the times (the ones I can remember, anyway) I bombed a software engineer job interview. There are so many “how I aced interviewing at X”/“how to pass X interview” posts floating around that I thought the opposite story would make for an amusing read.</p>
<p>My first developer job was as an intern at a big tech company in 2012. I think that was one of the worst interviews I’ve had, by the way – I could barely understand the interviewer over the cellphone, and those were the days of “how many piano players are there in New York”-style questions. I thought it went terribly, but I got the job somehow. On the other hand, I’ve had many interviews I thought went great but bombed anyway.</p>
<!-- more -->
<h2 id="faang-1-2013">FAANG 1 (2013)</h2>
<p>I was just out of school when I got a cold call from a FAANG recruiter asking if I was interested in interviewing there. Needless to say I was naive and didn’t quite know what I was getting myself into.</p>
<p>The first part of the process was a phone screen with a technical recruiter. She asked me something like “What is the fastest way to sum two 32-bit integers?”. I froze like a deer in the headlights.</p>
<p>After gasping in awe at the bizarre question and mumbling some nonsense, I was informed that my answer was <em>not</em> correct. The <em>right</em> answer, the recruiter said, was “using the CPU’s TLB”. The <a href="https://en.wikipedia.org/wiki/Translation_lookaside_buffer">translation lookaside buffer</a> is a cache that stores recent virtual-to-physical address translations. I still had college classes in my somewhat-recent memory at the time, so I kind of knew that this thing existed, but to this day I still don’t know how to sum two integers with it.</p>
<h2 id="faang-1-2014">FAANG 1 (2014)</h2>
<p>A year passed and (the same) FAANG came to my town with a recruitment event. Again I got a call, and instead of a phone screening, I would go straight to the event location and do a quick onsite interview.</p>
<p>They provided recommendations on technical subjects I should refresh my memory about: basically Algorithms 101 syllabus; sorting algorithms, red/black and AVL trees, A* and Dijkstra, NP-complete problems, etc, as well as some operating systems topics: processes, threads, mutexes, scheduling algorithms and so on.</p>
<p>At the time I had just started grad school. I was a bit more seasoned than the last interview, but far from having any relevant industry experience.</p>
<p>Interview day. I got to the venue and sat reading my notes on how to balance AVL trees. The interviewers showed up, greeted me and got things started. They gave me pen and paper and asked me to do some things to a list of integers. I don’t recall the interview being particularly bad or anything, but round one was the end of the line for me. A few days later I got a boilerplate rejection email and that was my last contact with this FAANG.</p>
<h2 id="mid-sized-tech-2014">Mid-sized tech (2014)</h2>
<p>Grad school classes were few and far apart, so I decided to start looking for a job. This mid-sized tech company had a local office, so I got in touch and scheduled an interview.</p>
<p>I don’t remember any meaningful details from this interview. One thing I do remember is being asked what monthly compensation I expected. The interviewer passed me a scrap of paper and a pen for me to write the number down. I took a few moments to think and wrote down what I thought was an adequate number. In today’s US dollars, that number would be enough to afford a parking spot in San Francisco, but it was an okay salary for a junior hire in my town.</p>
<p>I didn’t get a lot of feedback here other than “we went with someone else”.</p>
<h2 id="faang-2-2019">FAANG 2 (2019)</h2>
<p>A few years had passed since my last failure. I had finished my education, become a Ruby developer and enjoyed a 3.5-year tenure at a great, small local software studio. I had a preferred text editor. I had dotfiles. I felt weathered. So I did what you’re supposed to do: I applied to FAANG.</p>
<p>FAANG responded to my application, and after a couple of <em>months</em> and being ghosted by one of the 5+ recruiters involved in the process, things got on track for the onsite.</p>
<p>I did the Leetcode thing daily. I read Glassdoor tips and talked to college friends that worked at this company. I even went on Blind and found out that you’re basically an idiot if you don’t pass FAANG’s interview, cause it’s so damn easy.</p>
<p>I passed the initial screening rituals and was invited for an onsite – 25 hours and 3 flights away. As I obsessively went through my notes in the hotel, I thought I was finally <em>ready</em>.</p>
<p>There were maybe 5 or 6 rounds, each with one or two interviewers, all very nice and moderately helpful. Pretty much a textbook FAANG interview, just like CTCI describes it. I was asked a particularly difficult set of questions. Mid-interview, I got the familiar feeling that a train was steaming towards me and I’d soon be smashed to smithereens. Looking back, I’d evaluate myself in this interview as Not Good.</p>
<p>Another 25 hours, 2-stop travel back to my town. Feedback came swiftly in the familiar lukewarm rejection phone call.</p>
<h2 id="big-tech-2019">Big Tech (2019)</h2>
<p>This Big Tech company is highly respected in the Ruby community and their culture seemed to align with my own. I tried, and failed, their interview process twice.</p>
<p>They have a fairly straightforward interview process: first a phone screen/short code challenge, then a longer behavioral/technical challenge, typically onsite. First time, I didn’t make it to the onsite.</p>
<p>My second attempt was much smoother and led to the onsite. Like <em>FAANG 2 (2019)</em>, this involved multiple flights and 20+ hours of travel. Also like my previous interview, I <em>felt ready</em>. The recruiter was great, and every person I met was extremely nice. Company culture seemed fantastic and I liked the tech stack.</p>
<p>There were two technical rounds, one cultural fit conversation and one technical-but-not-coding round. The coding parts went well, maybe a B+. The culture fit thing was very good. The not-coding part, ironically the one I had prepared the most, was a disaster, although I didn’t see things that way at the time.</p>
<p>I came prepared to talk about one of the projects I led in my job at the time; naturally I had an NDA in place and had to navigate around it. I thought this was fine – I had already published a post on the same subject on the company’s public blog and never had any complaints about it being too esoteric/abstract. The interviewer was not amused by this at all. Because of the vagueness of the language I was using, he seemed to think I was describing something shady. At one point, he actually asked something like “do your users know you are doing this”! Right now I think that was kind of funny, but I was utterly bewildered at the time. I tried to reassure him that there was nothing fishy going on, but at that point he had probably made up his mind. I might as well have gone straight to the airport and saved everyone some time.</p>
<p>The rejection call came a week later. Upon my request, the recruiter followed up with a very thorough email detailing the reasons for the rejection, which is a fairly unusual thing for these companies to do, and very helpful for the candidate. Although I think the interviewer could have handled the situation better (just ask me to describe another project), I have great respect for the way the company handled the process and gave honest feedback.</p>
<h2 id="the-printer-in-the-room">The printer in the room</h2>
<p>Hiring is the printer of software engineering jobs. It kind of works, but not very well, and everyone seems to agree it should be better at this point. This is in no way a demerit to recruiters – they’re doing their job and are not at fault here. They’re usually pretty good; it’s the framework that isn’t great. There have been some incremental improvements: code sharing platforms are pretty good, which makes remote interviewing very straightforward (and reduces the need for onsites, <a href="https://lbrito1.github.io/blog/2021/08/onsites.html">which suck</a>). There are many services where you build a single profile and apply to many companies at once, which reduces the time wasted filling out the same forms on all the different companies’ websites. The bulk of the process remains more or less the same, though.</p>
<p>Interviewing is in part a numbers game, but also not <em>entirely</em> random, which means there is a way to get better at it (that’s the whole point of companies like Leetcode and books like CTCI). Failing an interview you prepared for leaves a sour taste in the mouth, but over time it gets easier to accept as just part of the probability game.</p>
tag:lbrito1.github.io,2021-10-07:/blog/2021/09/job_offers_pandemic.htmlAnalyzing LinkedIn's data export: what happened in 2021?2021-10-07T14:45:00Z2021-10-07T14:45:00Z<div class="image-box stretch">
<div>
<a href="/2021/09/job_offers_pandemic.html">
<img class="lazy" data-src="/blog/assets/images/2021/linkedin-wordcloud.png" alt="Bar chart showing distribution of jobs I applied to per country. US ranks first with over 10 applications.">
<noscript>
<img src="/blog/assets/images/2021/linkedin-wordcloud.png" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>I’ve been using LinkedIn basically since I started working as an intern back in 2012. My usage is mostly limited to posting my blog posts, except the couple of times I used the platform to search for a new job. So most of the time, LinkedIn has been pretty slow-paced, with maybe half a dozen random recruiters reaching out per year.</p>
<p>However, since the Covid-19 pandemic started, and particularly in 2021, things seem to have gone a little crazy, with a <em>lot</em> more recruiter activity. I was curious to see just how much things had changed, so I looked at LinkedIn’s data export.</p>
<!-- more -->
<p>First I requested my data from LinkedIn:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/linkedin-request.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/linkedin-request.png" alt="">
<noscript>
<img src="/blog/assets/images/2021/linkedin-request.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Messages, Connections and Invitations seem like the most promising sources of data:</p>
<div class="highlight"><pre><code class="language-bash">➜ Basic_LinkedInDataExport_10-01-2021 <span class="nb">ls</span> <span class="nt">-lahtS</span>
total 1020K
<span class="nt">-rw-rw-r--</span> 1 leo leo 577K out 1 13:06 messages.csv
<span class="nt">-rw-rw-r--</span> 1 leo leo 146K out 1 13:05 Connections.csv
<span class="nt">-rw-rw-r--</span> 1 leo leo 113K out 1 13:05 Contacts.csv
<span class="nt">-rw-rw-r--</span> 1 leo leo 43K out 1 13:05 Learning.csv
<span class="nt">-rw-rw-r--</span> 1 leo leo 23K out 1 13:05 Invitations.csv</code></pre></div>
<p>The Connections export is somewhat limited for our purposes: I only actively add people on LinkedIn during a job search.</p>
<p>Messages are a bit more interesting because a lot of recruiters immediately offer a position in their first contact (sometimes even with a pre-scheduled Google calendar event! I wish things were this straightforward back when I was finishing school).</p>
<p>Invites are also a good source, complementary to Messages. After an invite is accepted or rejected, it is deleted, so there’s no danger of double counting an interaction that started as an Invitation and then evolved into Messaging.</p>
<p>Focusing first on the Messages export, here is some relevant information we might aspire to extract:</p>
<ul>
<li>Job offers (as in “I have a job I want you to apply for”) per date</li>
<li>Keywords mentioned in messages (“Ruby”, “Rails” etc)</li>
</ul>
<p>Let’s see if we can extract those.</p>
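<p>Both exports are plain CSVs, so a minimal loader using Ruby’s standard library is all we need to get started. This is just a sketch – the filenames follow the export listing above, and <code>load_export</code> is a hypothetical helper name (the actual scripts are in the repo linked at the end):</p>
<div class="highlight"><pre><code class="language-ruby">require "csv"

# Load a LinkedIn CSV export into an array of row hashes.
# Headers such as "CONTENT" and "DATE" are the ones LinkedIn
# uses in messages.csv, which the later snippets rely on.
def load_export(path)
  CSV.read(path, headers: true).map(&:to_h)
end

# @input in the snippets below would then be, e.g.:
# messages = load_export("messages.csv")
# invites  = load_export("Invitations.csv")</code></pre></div>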
<h2 id="job-offers-per-month">Job offers per month</h2>
<p>Job offers mainly come from messages, and the bulk of my messages come from recruiters. However, I do get a few scattered personal messages from old acquaintances, some professional but not interview-related conversations, etc.</p>
<p>A simple approach to estimate how many messages are actually from someone promoting a job opening is to look for certain job-related terms: in my case, as a Ruby engineer, if a message contains “Ruby” it is probably from a recruiter advertising a Ruby-related position. This is only an estimate: maybe I chatted about Ruby at some point with an acquaintance, which of course is unrelated to our objective here. Those cases are few and far between compared to the recruiter conversations, though.</p>
<p>With that in mind, I built a list of terms that are related to job searches:</p>
<div class="highlight"><pre><code class="language-ruby"><span class="n">job</span> <span class="n">offer</span> <span class="n">opportunity</span> <span class="n">ruby</span> <span class="n">developer</span> <span class="n">engineer</span> <span class="n">talent</span> <span class="n">salary</span> <span class="n">relocation</span> <span class="n">position</span> <span class="n">role</span> <span class="n">recruiter</span> <span class="n">talent</span> <span class="n">looking</span> <span class="n">interested</span> <span class="n">oportunidade</span> <span class="n">trabalho</span> <span class="n">vaga</span> <span class="n">software</span> <span class="n">experience</span> <span class="n">tech</span> <span class="n">rails</span> <span class="n">interesse</span> <span class="n">interested</span> <span class="n">company</span> <span class="n">email</span> <span class="n">work</span> <span class="n">senior</span> <span class="n">contato</span> <span class="n">vagas</span> <span class="n">remote</span> <span class="n">working</span> <span class="n">stack</span> <span class="n">backend</span> <span class="n">technical</span> <span class="n">developers</span> <span class="n">skill</span> <span class="n">skills</span></code></pre></div>
<p>Most of the messaging I get is in English, but I do get a significant amount of contacts in Portuguese as well, so we have terms in both languages.</p>
<p>With that list of terms, we can simply <code>select</code> all the relevant ones and group those by month:</p>
<div class="highlight"><pre><code class="language-ruby"> <span class="k">def</span> <span class="nf">job_messages_per_month</span>
<span class="n">job_related_messages</span> <span class="o">=</span> <span class="vi">@input</span><span class="p">.</span><span class="nf">select</span> <span class="p">{</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span> <span class="n">row</span><span class="p">[</span><span class="s2">"CONTENT"</span><span class="p">]</span> <span class="o">=~</span> <span class="sr">/</span><span class="si">#{</span><span class="vi">@relevant_words</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"|"</span><span class="p">)</span><span class="si">}</span><span class="sr">/i</span> <span class="p">}</span>
<span class="n">metric_per_month</span><span class="p">(</span><span class="n">job_related_messages</span><span class="p">)</span> <span class="p">{</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span> <span class="no">Time</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="s2">"DATE"</span><span class="p">]).</span><span class="nf">to_date</span> <span class="p">}</span>
<span class="k">end</span></code></pre></div>
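<p>The <code>metric_per_month</code> helper isn’t shown here (the real one lives in the repo linked at the end); a plausible stand-in groups the rows by the calendar month of whatever date the block extracts, then counts rows per month:</p>
<div class="highlight"><pre><code class="language-ruby">require "date"

# Hypothetical stand-in for metric_per_month: count rows per
# calendar month, using the block to pull each row's date.
def metric_per_month(rows, &date_of)
  rows.group_by { |row| date_of.call(row).strftime("%Y-%m") }
      .transform_values(&:size)
      .sort.to_h
end</code></pre></div>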
<p>Now, when I’m not actively looking for another job, I tend not to look at LinkedIn too much, so the Invitations tend to pile up. As already mentioned, accepted/rejected invites get “deleted” from LinkedIn’s data export (which doesn’t seem like a great practice IMO, as they probably still have that data), so only invites that you haven’t acted on either way are available in the CSV export.</p>
<p>Just like with messages, we group the relevant invites (“Inbound”, meaning someone is adding you as opposed to “Outbound” where you’re adding someone) by date. I didn’t bother filtering by terms because nearly everyone that adds me is a recruiter these days:</p>
<div class="highlight"><pre><code class="language-ruby"> <span class="k">def</span> <span class="nf">recruiters_per_month</span>
<span class="n">received_invites</span> <span class="o">=</span> <span class="vi">@input</span><span class="p">.</span><span class="nf">select</span> <span class="p">{</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span> <span class="n">row</span><span class="p">[</span><span class="s2">"Direction"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"INCOMING"</span> <span class="p">}</span>
<span class="n">metric_per_month</span><span class="p">(</span><span class="n">received_invites</span><span class="p">)</span> <span class="p">{</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span> <span class="no">Time</span><span class="p">.</span><span class="nf">strptime</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="s2">"Sent At"</span><span class="p">],</span> <span class="s2">"%m/%d/%y"</span><span class="p">).</span><span class="nf">to_date</span> <span class="p">}</span>
<span class="k">end</span></code></pre></div>
<p>Here’s the result of summing both data, messages and invites:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/linkedin-offers-month.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/linkedin-offers-month.png" alt="">
<noscript>
<img src="/blog/assets/images/2021/linkedin-offers-month.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>There was a spike in early 2019, when I actively pursued a new job. I also gave a <a href="http://www.thedevelopersconference.com.br/tdc/2019/florianopolis/trilha-web-frontend">conference talk</a> at that time and added a bunch of people on LinkedIn. Thus, this peak in job offers is just a consequence of me actively looking for a job. After that, activity dropped back to lower levels (I also ticked “not looking for a job” on LinkedIn right after I signed the offer at my new job around April 2019).</p>
<p>On the other hand, since late 2020 job offer messaging has grown steadily. I wasn’t actively looking, so this is organic growth. I’m curious to see if other people saw a similar pattern. Perhaps this is a reflection of an increase in demand for some specific skillset (Ruby) or experience level (X years of experience), but I’m guessing it’s part of a general upwards trend in the industry since the beginning of the pandemic.</p>
<h2 id="most-common-terms">Most common terms</h2>
<p>Another interesting piece of information is the most common words mentioned in the conversations.</p>
<p>We could just count how frequently each word pops up, but irrelevant words like “the”, “a” and so on would rank at the top. So first we need to get rid of those words, then look at the filtered text. I’m sure there’s an API that does just that somewhere out there, but I created my own list of non-relevant words manually and used that instead.</p>
<div class="highlight"><pre><code class="language-ruby"> <span class="k">def</span> <span class="nf">word_frequencies</span>
<span class="n">full_text</span> <span class="o">=</span> <span class="vi">@input</span><span class="p">.</span><span class="nf">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span> <span class="n">row</span><span class="p">[</span><span class="s2">"CONTENT"</span><span class="p">]</span> <span class="p">}.</span><span class="nf">compact</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">" "</span><span class="p">)</span>
<span class="n">normalize_words</span><span class="p">(</span><span class="n">full_text</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sr">/\s+/</span><span class="p">))</span>
<span class="p">.</span><span class="nf">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">w</span><span class="o">|</span> <span class="n">w</span><span class="p">.</span><span class="nf">gsub</span><span class="p">(</span><span class="sr">/[^a-z]+/</span><span class="p">,</span> <span class="s2">""</span><span class="p">)</span> <span class="p">}</span>
<span class="p">.</span><span class="nf">reject</span> <span class="p">{</span> <span class="o">|</span><span class="n">w</span><span class="o">|</span> <span class="n">w</span><span class="p">.</span><span class="nf">size</span> <span class="o"><</span> <span class="mi">3</span> <span class="o">||</span> <span class="vi">@nonrelevant_words</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">w</span><span class="p">)</span> <span class="p">}</span>
<span class="p">.</span><span class="nf">group_count</span><span class="p">.</span><span class="nf">sort_by</span><span class="p">{</span> <span class="o">|</span><span class="n">a</span><span class="o">|</span> <span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="p">}.</span><span class="nf">reverse</span>
<span class="k">end</span></code></pre></div>
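<p>Two helpers in that snippet aren’t core Ruby and aren’t shown in the post: <code>group_count</code> and <code>normalize_words</code>. These are guesses at what they do – <code>tally</code> (Ruby 2.7+) handles the counting, and downcasing is the obvious normalization, though the real helpers may do more:</p>
<div class="highlight"><pre><code class="language-ruby"># Hypothetical stand-ins for the helpers used by word_frequencies.
module Enumerable
  # tally builds a word => count hash; to_a yields the
  # [word, count] pairs that the sort_by above expects.
  def group_count
    tally.to_a
  end
end

def normalize_words(words)
  words.map(&:downcase)
end</code></pre></div>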
<p>Using the <a href="https://github.com/zverok/magic_cloud">MagicCloud</a> gem, here’s the plotted results:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/linkedin-wordcloud.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/linkedin-wordcloud.png" alt="">
<noscript>
<img src="/blog/assets/images/2021/linkedin-wordcloud.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>No surprises there – terms like “Ruby” and “Rails” are among the most frequent. Other bland job-related terms compose the bulk of the word cloud.</p>
<p>Here are the actual numbers for these terms:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/linkedin-terms.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/linkedin-terms.png" alt="">
<noscript>
<img src="/blog/assets/images/2021/linkedin-terms.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<h2 id="what-happenedis-happening">What happened/is happening?</h2>
<p>Going back to the original question: what happened in 2021? Did software-related job offers explode during the pandemic? My anecdata points to an obvious Yes. Most articles discussing this question are just opinion pieces along the lines of “engineering demand increased because digital services sharply expanded with the lockdowns”. I couldn’t find any hard data supporting these assumptions (other than the personal analysis presented above).</p>
<p>One of the major impacts of the pandemic – being forced into remote-only work – did have pretty obvious effects on non-US job markets. Before the pandemic, I suspect that many very competent people were hesitant to leave their jobs due to strictly non-remote perks: nice offices, work colleagues, local benefits like healthcare, maybe even specific labor laws regarding vacation time and so on.</p>
<p>Remote jobs based in strong currency countries, especially the US, were already “a thing” long before the pandemic, but with remote work being mandatory rather than an option, local remote vs foreign remote boils down to a huge pay gap in most cases, with US-based software engineering salaries being hard to compete with anywhere else in the world.</p>
<p>I’m very curious to see how this pans out for local software shops. These local companies are bleeding talent to stronger currencies; if these dynamics go on for too long, I can’t see how most of them will last.</p>
<h2 id="code">Code</h2>
<p><a href="https://github.com/lbrito1/LinkedIn-insights">Here’s the repo</a> with all the scripts needed to reproduce these results.</p>
tag:lbrito1.github.io,2021-08-27:/blog/2021/08/onsites.htmlOnsites considered harmful2021-08-27T21:33:00Z2021-08-27T21:33:00Z<div class="image-box stretch">
<div>
<a href="/2021/08/onsites.html">
<img class="lazy" data-src="/blog/assets/images/2021/onsite-sm.jpg" alt="">
<noscript>
<img src="/blog/assets/images/2021/onsite-sm.jpg" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>A couple of years ago I interviewed at one of the largest Ruby shops out there. Screening went well, and some days later I was invited for an onsite.</p>
<p>These were the good old pre-covid days, so an onsite really meant <em>onsite</em>. You had to travel to the office, wherever that was.</p>
<p>The thing is, an onsite is actually radically different depending on <em>where you live</em>. It follows that onsites introduce further bias into our industry’s already problematic hiring process. I’d like to argue that although onsites have some advantages, they’re mostly a waste of time (and money).</p>
<!-- more -->
<p>If you’re a local, an onsite probably means taking a bus, metro, taxi, walking, whatever. If you’re not local but are at least within the same country that you’re interviewing, it might take a day trip or maybe a short flight. If you’re a foreigner it might take travel visas and a week of travel.</p>
<p>Without getting too philosophical, we all know we have a limited amount of sand grains in our hourglass. <a href="https://www.wikiwand.com/en/Sunk_cost">Fallacies apart</a>, anyone can <em>feel</em> that the more we pour into something – be it renovating a kitchen or traveling for an onsite – the higher the stakes become.</p>
<p>It’s glaringly obvious that someone who invested a 30-minute bus ride to an onsite will be much more at ease than someone who flew godless hours on a red-eye. It doesn’t really matter how much pampering the latter receives: exquisite hotels, meal allowances… investing a week of your time will always drive up anxiety a lot more than taking an afternoon off work.</p>
<p>Back to my story: I was interviewing for a company overseas. I happened to have a valid visa for that country, which already puts me in some advantage compared to others. Physically getting to the company building for the interview, however, took some effort: I drove to the local airport, where I arrived <a href="https://www.theonion.com/dad-suggests-arriving-at-airport-14-hours-early-1819573933">more than a couple of hours early</a>, flew down to São Paulo, then took two more flights until I reached my final destination, some <em>thirty hours</em> after I stepped out of my house, then I took another cab to the fancy hotel booked by my not-to-be future company and collapsed for the night.</p>
<p>Next day I had the onsite (which took basically the full business hours), then back to the hotel, collapsed again. On the third day I backtracked through the 30 hours of cabs, airports and flights back home. This was late December, by the way, so airports were <em>packed</em>. A couple of days later, on Christmas eve, I got a very thoughtful <em>happy holidays + no thanks</em> call from the recruiter.</p>
<p>Might I have gotten the job if I had taken a bus instead of multiple planes? Maybe, maybe not (probably not, since someone in the interview panel just didn’t enjoy my parlance).</p>
<p>That isn’t really the point, though, and as far as <em>anecdata</em> goes, I have the opposite story as well: I interviewed twice at the same company, once through a tortuous voyage similar to the one I described above, and another time at my city, with the roles reversed: I left my house and drove for a few minutes to the onsite, while the interviewers were enjoying a fancy beachside hotel after several plane trips. I failed the former and got an offer from the latter. I’m the same person applying for the same job at the same company. Did I just perform better? Did they perform “worse”? Is it all just a coincidence? Are all of these interviews meaningless hazing rituals?</p>
<p>But I digress. Back to the matter at hand: if you think about it, the <em>60 hours</em> of commuting alone is more than one work week (and as far as effort goes, I’m sure more goes into enduring 60 hours of planes and airports than into programming). If you factor in the actual onsite, then we’re talking about <em>two workweeks</em> of effort put into a no-strings-attached situation. The elapsed wall time is well into a full business week.</p>
<p>There are sensible, rational grounds for an onsite. Recruiters want to know if the candidate hates the cold and is going to churn come early winter, or maybe the city is too small, or too big, or whatever random factor might make people want to run away. That said, I find it hard to believe even the most prescient interviewer can get a read on any of those thoughts rushing through the candidate’s head (probably even the candidate can’t).</p>
<p>In any case, are those things worth the several thousand dollars of expenses, and more importantly, are they worth excluding a possibly large pool of candidates that aren’t willing to invest a full week of their time on a process with <a href="https://blog.interviewing.io/technical-interview-performance-is-kind-of-arbitrary-heres-the-data/">naturally low chances of going forward</a>? At least in my opinion, those very thin pros are outshined by the very real cons like the <code>-8.208527</code> Sun outshines my laptop screen.</p>
<p>Now let us also remember that the success of job searches depends on arbitrary things like if everyone on the panel likes your face. We all know things shouldn’t be this way; we’re supposed to be unbiased and empathetic, but let’s face it – we humans <a href="https://www.sciencedaily.com/releases/2020/07/200714101228.htm"><em>suck</em></a> at that.</p>
<p>Even if we consider a utopically unbiased interviewer panel, there are still all sorts of random noise going on at an interview, like performance anxiety. No matter how great the people interviewing are, and even how great <em>you</em> are, interviewing always has a huge degree of uncertainty:</p>
<blockquote>"The fact that people who are overall pretty strong (e.g. mean ~= 3) can mess up technical interviews as much as 22% of the time shows that there’s definitely room for improvement in the process"<a href="https://blog.interviewing.io/technical-interview-performance-is-kind-of-arbitrary-heres-the-data/">
<br>- Technical interview performance is kind of arbitrary. Here’s the data.</a>
</blockquote>
<p>My point is: such a volatile thing should <strong>never</strong> have been tied to multiple plane tickets and two-night hotel stays on a different continent in the first place.</p>
<p><strong>Never</strong>.</p>
<p>Perhaps surprisingly, “onsites” are still a thing during the COVID pandemic – they’re just remote, i.e., <em>not really onsites at all</em>.</p>
<p>This is immensely beneficial for everyone involved: the company won’t have to pay for expensive hotels and plane tickets, the planet won’t have to suffer the huge CO2 emissions from this ultimately unnecessary shenanigan, and the candidate won’t have to waste a week of their vacation time on something as ephemeral as pursuing a software engineering job.</p>
<p>Recruitment in this industry is difficult. This is widely acknowledged by all parties involved. No wonder there are so many books, videos and Discord channels about interviewing for a tech job – not to mention coding prep services, automated third-party code challenges… the list goes on.</p>
<p>This post is specifically about onsites, but it is impossible not to mention the overall sad state of interviewing for a software engineer job. A quick survey of HN posts is enough to glimpse how people feel about this:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/hiring-1.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/hiring-1.png" alt="Search results for 'hiring is broken' on Algolia's HN search page.">
<noscript>
<img src="/blog/assets/images/2021/hiring-1.png" alt="Search results for 'hiring is broken' on Algolia's HN search page.">
</noscript>
</a>
</div>
<div class="image-caption">Sounds like hiring isn't in a great shape.</div>
</div>
<p>Inefficient and biased as they are (or, hopefully, <em>were</em>), physical onsites are nowhere near the worst possible interviewing practice we can observe in the wild.</p>
<p>Why interview for a job in a quiet office full of nerds if you can <strong>FIGHT FOR IT IN A TOURNAMENT</strong> like a geek gladiator?</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/hiring-tournament.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/hiring-tournament.jpg" alt="An email from a company called BairesDev inviting me to fight for a job in a coding tournament.">
<noscript>
<img src="/blog/assets/images/2021/hiring-tournament.jpg" alt="An email from a company called BairesDev inviting me to fight for a job in a coding tournament.">
</noscript>
</a>
</div>
<div class="image-caption">Yikes.</div>
</div>
<p>Things aren’t any better on the other side of the table – finding skilled developers in 2021 is tough, even if you’re not setting up a pair-programming arena for a code-to-the-death contest.</p>
<p>Some recruiters go a step beyond cold calling and start <em>cold referral calling</em>, like this recruiter asking me to <em>pleeeeeeeeeease</em> refer candidates that match their laundry list of requirements:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/hiring-pleeeeaase.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/hiring-pleeeeaase.png" alt="A recruiter message on Linkedin asking me to refer candidates, PLEEEEEEEASE.">
<noscript>
<img src="/blog/assets/images/2021/hiring-pleeeeaase.png" alt="A recruiter message on Linkedin asking me to refer candidates, PLEEEEEEEASE.">
</noscript>
</a>
</div>
<div class="image-caption">Pleeeeeeeeeease send me candidates that match my laundry list of skills!</div>
</div>
<p>My second point: recruitment is seriously hard for all parties involved. If we are able to, we should try to make it <em>easier</em>, not <em>harder</em>.</p>
<p>Onsites were a significant hassle on top of an already complicated, inefficient, time-consuming, stressful process.</p>
<p>Although the company usually took on most or all of the financial hit, the time and emotional load were carried by the candidate alone.</p>
<p>Getting rid of physical onsites is <em>fantastic</em> news for everyone – especially people interviewing, but also companies that can now cast a wider net and carry out a faster, more diverse recruitment process. And our planet will also have God-knows-how-many thousand tonnes of CO2 less to deal with each year.</p>
<h2>Efficient resource distribution (2021-08-20)</h2>
<div class="image-box stretch">
<div>
<a href="/2021/08/efficient_resource_distribution.html">
<img class="lazy" data-src="/blog/assets/images/2021/resource-distribution.jpg" alt="">
<noscript>
<img src="/blog/assets/images/2021/resource-distribution.jpg" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<blockquote>
<p><strong>TLDR</strong> A simple metrics-based ranking system is good enough to decide who gets how many resources.</p>
</blockquote>
<p>Computational resources – CPU time, memory usage, network traffic etc – are limited. This may be more or less of a problem depending on project/company size and so on; if you’re working on a smaller product with limited traffic, it might not be meaningful at all.</p>
<p>Once past a certain threshold, though, spending on such resources becomes non-trivial and it begins to make sense to spend some time thinking about how to distribute them as efficiently as possible.</p>
<p>Here’s the problem that got me thinking about this: at work, we had a computational resource that needed to be consumed by a large fleet of workers (think several thousand concurrent), but each <em>type</em> of worker had different <em>productivity</em>, and that productivity changed over time. How can we decide who gets what?</p>
<!-- more -->
<p>So the problem is: you have a set of <em>consumers</em> that use the <em>same resource</em>, for which you have a static budget. The consumers all solve the same problem, more or less (i.e. have the same <em>output</em>), but come in different <em>types</em> that have different <em>productivities</em> (defined as <em>output per resource consumption</em>). Additionally, although the consumer types solve the same problem, we want consumers to be as diverse as possible – we can’t just pick the best performing one and go with that.</p>
<p>The first thought that comes to mind is <em>this seems ordinary enough; there must be an easy, well-known solution</em>. There might be, but I couldn’t find any that was simple and effective for this use case. The closest I got were <a href="https://www.wikiwand.com/en/PID_controller">PID controllers</a>, which solve a similar problem, but probably don’t solve the entire problem here (and also seem complicated).</p>
<p>I gave the problem some thought and came up with a reasonable solution that has been working well for a year now.</p>
<p>The problem boils down to two parts:</p>
<ol>
<li>Consistently keeping track of productivity among the different consumer types;</li>
<li>Deciding how to share the resource among the consumers.</li>
</ol>
<p>The concept that glues both parts is that of the <em>cycle</em> – a repeating time period in which we measure productivity and distribute resources to be shared within that time frame, until the next cycle comes up and everything is recalculated.</p>
<p>Problem 1) boils down to maintaining a time series of how much output per resource each consumer type produced during the latest cycle.</p>
<p>Problem 2) comes almost as a corollary to the former problem: we want the best global output possible, and that can be guessed by using the productivity stats from the previous cycle. This won’t be perfect, because productivity varies over time within each consumer type, but basic <a href="https://www.wikiwand.com/en/Volatility_clustering">statistical intuition</a> says it will be good enough for our purposes.</p>
<p>So the first step of solving 2) is building a <em>ranking of consumers by productivity</em>. We want a diverse set of consumer types, though, so we can’t just pick <code>type #1</code> and give it 100% of the resources all the time. Also, the ranking might change each cycle, and we don’t want resource distribution to be too volatile – that might become hard to monitor and debug. We want something that is somewhat smooth, stable, convergent, but at the same time that reflects changes in productivity as quickly as possible, and that delivers good global output-per-resource-consumption.</p>
<p>We know that the top tier within the ranking probably deserves more than the rest, while the bottom tier probably deserves less, and that is the gist of the solution to problem 2). We don’t know beforehand how each consumer will perform though, so it makes sense to start with equal resource distribution among them.</p>
<p>Here’s the complete solution I worked out:</p>
<p>Start the system by sharing an equal amount of resources among all consumers: let’s say every consumer has the same weight <em>W<sub>0</sub></em>.</p>
<p>Then, for each cycle:</p>
<ol>
<li>Build the productivity-per-consumer-class ranking</li>
<li>For the top <em>N%</em> consumers, do <em>W += K</em> (limited to a certain maximum)</li>
<li>For the bottom <em>N%</em> consumers, do <em>W -= K</em> (limited to a certain minimum)</li>
<li>Translate each <em>W</em> to a real-world resource amount (e.g. “1GB RAM” or something). This involves the weights as well as the global resource budget per time cycle, such that we guarantee we won’t exceed the static budget.</li>
</ol>
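<p>To make the scheme concrete, here’s a minimal Ruby sketch of one cycle. Everything here is illustrative – the constants, consumer names and in-memory hashes are hypothetical stand-ins, not the production code:</p>

```ruby
# Sketch of one distribution cycle: weights drift toward the
# productivity ranking, clamped to [W_MIN, W_MAX].
W_MIN = 1.0
W_MAX = 10.0
K     = 0.5  # weight step per cycle
TOP_N = 0.2  # promote/demote the top and bottom 20%

def run_cycle(weights, productivity, budget)
  # 1. Rank consumer types by last cycle's output-per-resource.
  ranked = productivity.sort_by { |_type, prod| -prod }.map(&:first)
  cutoff = (ranked.size * TOP_N).ceil

  # 2-3. Bump the top tier up and the bottom tier down.
  ranked.first(cutoff).each { |t| weights[t] = [weights[t] + K, W_MAX].min }
  ranked.last(cutoff).each  { |t| weights[t] = [weights[t] - K, W_MIN].max }

  # 4. Translate weights into concrete resource amounts that
  #    sum to the static budget for this cycle.
  total = weights.values.sum
  weights.transform_values { |w| budget * (w / total) }
end

weights      = { a: 5.0, b: 5.0, c: 5.0 }  # equal start, W0
productivity = { a: 3.0, b: 1.0, c: 2.0 }  # output per resource, last cycle
allocation   = run_cycle(weights, productivity, 100.0)
```

<p>Running this every cycle with fresh productivity stats reproduces the behavior described above.</p>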
<p>The global sum of weights is kept more or less the same because we’re summing and subtracting the same amounts each cycle (although this isn’t perfect because we have min and max values), so the system is kept fairly stable over time while also reacting quickly to changes in productivity. Also, the system is robust, and blowing up the weights store is no big deal – weights will creep back to their previous values over a short time.</p>
<p>To finalize, here’s a chart showing the weights of different types of consumers over the last few months:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2021/grafana.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2021/grafana.jpg" alt="Dashboard showing a chart with several colored lines representing the weights of each consumer class.">
<noscript>
<img src="/blog/assets/images/2021/grafana.jpg" alt="Dashboard showing a chart with several colored lines representing the weights of each consumer class.">
</noscript>
</a>
</div>
<div class="image-caption">Consumer class weights over time.</div>
</div>
<h2>I replaced Google Analytics with a web server running on my phone (2020-07-06)</h2>
<div class="image-box stretch">
<div>
<a href="/2020/07/replacing_google_analytics_android.html">
<img class="lazy" data-src="/blog/assets/images/2020/simple_diagram.png" alt="">
<noscript>
<img src="/blog/assets/images/2020/simple_diagram.png" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<blockquote>
<p><strong>TLDR</strong> I built <a href="https://github.com/lbrito1/android-analytics">android-analytics</a>, a web analytics tracker running on my phone.</p>
</blockquote>
<p>Say you run a blog, personal website, small-time business page or something of the sorts. Say you also want to keep an eye on how many visitors you’re getting.</p>
<p>The first thing most people think of at this point is “Google Analytics”. It mostly works and is free. It’s also hosted by Google, which makes it very easy to start using. There aren’t many competitors that bring those points to the table, so Google Analytics usually wins by walkover at this point.</p>
<p>I used to use Google Analytics to track this blog for those same reasons. But after finding out about <a href="https://termux.com">Termux</a> and writing <a href="https://lbrito1.github.io/blog/2020/02/repurposing-android.html">this post</a> about installing a web server on an Android phone, I started toying with the idea that I had this ARM-based, 2GB RAM, Linux-like device with Internet connectivity which must be more than enough for a simple webcounter-like application. After a few weeks of tinkering, here it is!</p>
<!-- more -->
<h2 id="table-of-contents">Table of Contents</h2>
<ul>
<li>
<a href="#motivation">Motivation</a>
<ul>
<li><a href="#why-even-keep-anything">Why even keep anything?</a></li>
<li><a href="#and-then-there-is-the-data">And then there is the data</a></li>
<li><a href="#the-lack-of-competition">The (lack of) competition</a></li>
</ul>
</li>
<li>
<a href="#developing-android-analytics">Developing android-analytics</a>
<ul>
<li><a href="#basis">Basis</a></li>
<li><a href="#first-iteration-sinatra-webapp">First iteration: Sinatra webapp</a></li>
<li><a href="#second-iteration-nginx-log-parser">Second iteration: Nginx log parser</a></li>
<li><a href="#third-iteration-adding-a-viewer">Third iteration: Adding a viewer</a></li>
<li><a href="#fourth-iteration-adding-an-installation-script">Fourth iteration: Adding an installation script</a></li>
<li><a href="#final-architecture">Final architecture</a></li>
</ul>
</li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<h2 id="motivation">Motivation</h2>
<h3 id="why-even-keep-anything">Why even keep anything?</h3>
<p>Before going into this whole thing, there’s a very reasonable question to be answered: why do I even need to collect this data?</p>
<p>The answer is simple: I really don’t, I just enjoy seeing it. Call it a <a href="https://techcrunch.com/2011/07/30/vanity-metrics/">vanity metric</a>, but I think it’s just <em>plain cool</em> to know that someone halfway across the planet read something I wrote months ago (maybe it was just a crawler; I’ll take it either way).</p>
<p>It should be no surprise, then, that Google Analytics always felt immensely overkill.</p>
<p>It’s heartwarming to know that some nerd from Bhutan read one of my posts in the wee hours of the morning, but that is pretty much all I’m interested in. I couldn’t care less about Acquisition Treemaps, Audience Cohort Analysis or Behavior Flow. I’m not making those up: they’re all real products available on Google Analytics. I have no idea what any of those mean, yet I’m 100% sure I don’t need them.</p>
<div class="image-box">
<div>
<a href="/blog/assets/images/2020/visitor_count.jpeg" target="_blank">
<img class="lazy" data-src="/blog/assets/images/2020/visitor_count.jpeg" alt="Visitor counter from the 90s.">
<noscript>
<img src="/blog/assets/images/2020/visitor_count.jpeg" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption">Visitor counter from the 90s.</div>
</div>
<p>What I wanted was closer to the late 90s’ visitor count GIF above (minus the embarrassment of publicity) than to the unsightly “Interstitial online advertising network conglomerate SEO dashboard” feeling of Google Analytics:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/ga.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/ga.png" alt="Google Analytics dashboard.">
<noscript>
<img src="/blog/assets/images/2020/ga.png" alt="Google Analytics dashboard.">
</noscript>
</a>
</div>
<div class="image-caption">Google Analytics dashboard.</div>
</div>
<p>In short, I wanted to geek out, not do advertisement arbitrage.</p>
<h3 id="and-then-there-is-the-data">And then there is the data</h3>
<p>As mentioned before, Google Analytics is great, free, <em>and hosted by Google</em>.</p>
<p>They keep your data. I have no idea of what they do with that data, or even what exactly it is that their tracker is sending to their servers (judging from the number of articles showing how to keep the payload below the cap of 8kb, it must be a lot).</p>
<div class="image-box">
<div>
<a href="/blog/assets/images/2020/ga_payload.png" target="_blank">
<img class="lazy" data-src="/blog/assets/images/2020/ga_payload.png" alt="Google search results for 'google analytics payload size is too large'. 642,000 results.">
<noscript>
<img src="/blog/assets/images/2020/ga_payload.png" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption">That's a lot of results.</div>
</div>
<p>Apparently they often need over 8kb per request to feed their Lovecraftian “Audience Cohort Analysis” line of products. Fair enough, but I’m pretty sure that for my purposes, a several-kb payload is effectively using a sledgehammer to kill a fly.</p>
<p>By using Google Analytics I was willfully sending Google who-knows-what kind of data designed to build up people’s advertising profile. The page views of my blog probably didn’t help Google too much in that aspect, sure, but the principle of the whole thing still bothered me enough to do something about it.</p>
<h3 id="the-lack-of-competition">The (lack of) competition</h3>
<p>There is a lot of software similar to Google Analytics out there. The most prominent is probably <a href="https://matomo.org/">Matomo</a>, often <a href="https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=matomo&sort=byPopularity&type=story">posted on Hacker News</a>. It is free, open source and self-hosted (with cloud offerings for a monthly fee).</p>
<p>I would happily use Matomo, but with it comes a conundrum:</p>
<ul>
<li>Self-hosting means I’d have to have some kind of publicly accessible Linux host, which would likely not be entirely free;</li>
<li>Cloud-hosting comes with a subscription fee.</li>
</ul>
<p>Those points are trivial if you’re running a lucrative business that <em>needs</em> analytics, but paying for this service sounds ludicrous when all you want is simple visitor stats for a personal blog.</p>
<h2 id="developing-android-analytics">Developing android-analytics</h2>
<p>These were the requirements I had for my tracker:</p>
<ol>
<li>Has to run on an old Android phone I have lying around;</li>
<li>Has to work with Github Pages-hosted sites;</li>
<li>Has a per-page view count;</li>
<li>Nice to have: geo info.</li>
</ol>
<p>These requirements are deceptively simple, as I quickly learned.</p>
<p>Termux makes it really easy to run many kinds of software on your Android phone, and <a href="https://lbrito1.github.io/blog/2020/02/repurposing-android.html">I had already tinkered</a> with web servers with Termux. For something as simple as a page view, this should be pretty straightforward.</p>
<p>I had also already registered a dynamic DNS subdomain pointing to my phone, so it was ready to accept incoming traffic from the Internet.</p>
<p>The first major roadblock I faced was getting my Android-hosted web server to communicate with Github Pages. After a couple of days of research, I finally learned that it is basically impossible to make a request from an HTTPS website (which Github Pages is) to an HTTP address (my Dynamic DNS’s subdomain). To summarize, you can make that work, but at the cost of having the client browser do something (like actively mark an “allow mixed content” checkbox somewhere in the browser’s flags/advanced options).</p>
<p>This led me to the excruciating path of obtaining and using a verified SSL certificate in my Android phone with a Dynamic DNS subdomain. This took me long enough to want to write a separate <a href="https://lbrito1.github.io/blog/2020/06/free_https_home_server.html">blog post</a> about it. The TLDR here is that it is entirely possible to get a verified SSL cert for a Dynamic DNS subdomain – all of it entirely for free. Depending on your ISP, you’ll have different choices of SSL challenges, but as long as you’re able to receive TCP requests on port <code>443</code>, you can get the certificate.</p>
<p>Once I figured out the SSL thing, the rest was pretty much a breeze.</p>
<h3 id="fundamentals">Fundamentals</h3>
<p>I tried out a few different ideas when developing this, but the overall architecture is always the same:</p>
<ul>
<li>JavaScript code in my tracked page calls the Android host;</li>
<li>Android host saves that information in a database;</li>
<li>Some graphical tool is used to parse that data into something viewable (charts etc).</li>
</ul>
<h3 id="first-iteration-sinatra-webapp">First iteration: Sinatra webapp</h3>
<p>I started with a <a href="http://sinatrarb.com/">Sinatra</a> webapp with a single <code>POST</code> endpoint that would receive a request from the tracked page and immediately save it in a Postgres database. I used Nginx as a reverse-proxy that handled traffic before passing it to Sinatra.</p>
<p>This approach had the merit of being simple to understand and reliable. Also, it worked.</p>
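<p>Stripped of the Sinatra and Postgres plumbing, the core logic was tiny. A minimal sketch, with hypothetical names and an in-memory array standing in for the database:</p>

```ruby
require 'time'

# What the POST endpoint boiled down to: persist one page view.
def record_hit(store, page:, ip:, at: Time.now)
  store << { page: page, ip: ip, at: at.utc.iso8601 }
end

# The only "analytics" I actually wanted: views per page.
def views_per_page(store)
  store.group_by { |hit| hit[:page] }.transform_values(&:count)
end

hits = []
record_hit(hits, page: '/blog/post-1', ip: '203.0.113.7')
record_hit(hits, page: '/blog/post-1', ip: '198.51.100.2')
record_hit(hits, page: '/blog/post-2', ip: '203.0.113.7')
views_per_page(hits)  # => { "/blog/post-1" => 2, "/blog/post-2" => 1 }
```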
<p>But after watching it work for a few days, I realized that the whole webapp part was superfluous. Nginx logs all accesses by default, and the logs contain all the information I need: what page was requested, at what time and from what IP. This led naturally to the second iteration.</p>
<h3 id="second-iteration-nginx-log-parser">Second iteration: Nginx log parser</h3>
<p>Nginx provides flexible, per-endpoint logs: logs are activated for the endpoint that I want (<code>/damn_fine_coffee</code>) and deactivated for everything else. This is important because the Internet is full of crawlers that annoyingly hit the root path <code>/</code>, which obviously shouldn’t count as a page view. As I learned, the web is also surprisingly full of smartypants trying to make their way into <code>/tp-link</code>, <code>/admin</code> and so on; I also wanted to just ignore those.</p>
<p>The logs provided all the <em>data</em> I needed, but I still needed to transform that <em>data</em> into useful <em>information</em>. I found out about <a href="https://goaccess.io/">GoAccess</a> on Hacker News, and, perhaps surprisingly, it worked out of the box with Termux:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/goaccess.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/goaccess.png" alt="GoAccess dashboard with my Android-hosted data.">
<noscript>
<img src="/blog/assets/images/2020/goaccess.png" alt="GoAccess dashboard with my Android-hosted data.">
</noscript>
</a>
</div>
<div class="image-caption">GoAccess dashboard with my Android-hosted data.</div>
</div>
<p>At this point I could settle for GoAccess, but it didn’t seem to provide any geo info, which I always thought would be a cool feature, so I kept working on my own tool.</p>
<p>I configured Nginx to print CSV-like logs, and <a href="https://github.com/lbrito1/android-analytics/blob/master/app/compiler.rb">wrote a parser</a> that transforms those log entries into DB entries with geographic information provided by the excellent <a href="https://github.com/alexreisner/geocoder">geocoder</a> gem, and also anonymizes the request IPs using MD5 hashing. The final step was adding a cron entry to run the parser regularly.</p>
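<p>A simplified sketch of that parsing step – the field order of the log line is an assumption, and the geocoding is omitted; only the CSV parsing and MD5 anonymization mirror the description above:</p>

```ruby
require 'csv'
require 'digest'

# Turn one CSV-like Nginx log line into a row ready for the DB.
# Assumed log_format: remote_addr,time_iso8601,request_path
def parse_log_line(line)
  ip, time, path = CSV.parse_line(line)
  { ip_hash: Digest::MD5.hexdigest(ip), time: time, path: path }
end

row = parse_log_line('203.0.113.7,2020-07-06T13:45:40Z,/blog/post-1')
# The raw IP never reaches the database; only its hash does.
```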
<p>At this point I was getting regular traffic converted to rows in a Postgresql table. I still needed a more convenient way to look at the data, though.</p>
<h3 id="third-iteration-adding-a-viewer">Third iteration: Adding a viewer</h3>
<p>I initially thought about using <a href="https://grafana.com/">Grafana</a> as a visualization tool. It’s free, easy to use, flexible, and I was already familiar with it. Unfortunately Grafana doesn’t have binaries available for Termux (there’s an <a href="https://github.com/termux/termux-packages/issues/4801">issue</a> open in Termux’s repo requesting that), and I wasn’t feeling like trying to compile it manually.</p>
<p>Thankfully I found the <a href="https://github.com/ankane/blazer">blazer</a> gem, which has a very similar concept compared with Grafana: you write SQL queries and it transforms them into charts. That was exactly what I was looking for. The downside is that it requires a full-fledged Rails application to run, but I was okay with that trade-off.</p>
<p>Here’s what the data looks like right now:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/android-analytics-screenshot.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/android-analytics-screenshot.png" alt="blazer gem dashboard.">
<noscript>
<img src="/blog/assets/images/2020/android-analytics-screenshot.png" alt="blazer gem dashboard.">
</noscript>
</a>
</div>
<div class="image-caption">blazer gem dashboard.</div>
</div>
<h3 id="fourth-iteration-adding-an-installation-script">Fourth iteration: Adding an installation script</h3>
<p>So far I was playing by ear; I knew more or less how to reinstall the project on a new device, but I knew that after some time my memory would fade and the process would become a painstaking trial-and-error mess.</p>
<p>I first compiled all the steps needed for this to work in the repo’s README – it took a total of <a href="https://github.com/lbrito1/android-analytics/commit/9487a54b37c727bdd60b7276469fc58a8fd0d47d#diff-04c6e90faac2675aa89e2176d2eec7d8">17 steps</a> to get things running. Noticing that most of these steps could be automated, I wrote a <a href="https://github.com/lbrito1/android-analytics/blob/master/bin/setup.sh">setup script</a> that should do most of the work. I tested it in a separate Android device to make sure it works – hopefully it works for other people as well.</p>
<h3 id="final-architecture">Final architecture</h3>
<p>When someone accesses one of my tracked pages, this is roughly what happens:</p>
<ol>
<li>JavaScript on that page calls my domain (provided for free by <a href="https://www.duckdns.org/">DuckDNS</a>);</li>
<li>DuckDNS translates that address to my router’s most recent IP;</li>
<li>My router receives that request and uses the NAT table to redirect it to my Android phone;</li>
<li>On Android, Nginx receives the request and either logs it if the request comes from the right place (my list of tracked pages), or does nothing otherwise;</li>
<li>A scheduled Cron job rotates Nginx logs and converts the “old” log into rows in a Postgresql table;</li>
<li>I open <code>&lt;my-android-local-ip&gt;:3000</code> on my desktop’s browser and view the charts, maps etc.</li>
</ol>
<p>This diagram shows those same steps, more or less:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/diagram.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/diagram.png" alt="android-analytics diagram.">
<noscript>
<img src="/blog/assets/images/2020/diagram.png" alt="android-analytics diagram.">
</noscript>
</a>
</div>
<div class="image-caption">android-analytics diagram.</div>
</div>
<p>I named the tool (quite unimaginatively) android-analytics; code and set-up instructions are <a href="https://github.com/lbrito1/android-analytics">available on Github</a>.</p>
<h3 id="august-2021-update">August 2021 Update</h3>
<p>I managed to install Grafana on Termux by using <a href="https://f-droid.org/en/packages/exa.lnx.a/">AnLinux</a>; thus, the Viewer part of the project is no longer needed.</p>
<p>Also, by using <a href="https://ngrok.com/">Ngrok</a> (free tier), the project now works if you’re behind CGNAT, which is my case. No need for dynamic DNS or port forwarding either.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I used the Google Analytics analogy because that’s the tool most people are familiar with, and most people will immediately understand what this thing is about – which probably wouldn’t happen if, instead of saying this was a “simple Google Analytics alternative”, I said it was a “log-based web analytics tool”.</p>
<p>But saying this is a “Google Analytics replacement” is like saying that a bicycle is a replacement for a truck. Although they are both transportation modes, they’re different in every other aspect. The thing is: sometimes you really need a truck, but a lot of times you just need to get from point A to point B, and a bike is more than enough. In fact, it is probably <em>better</em>: it is cheaper, easier to park and carry around, and has a smaller environmental footprint. This project is a bike: for some people, that’s all they will need.</p>
<p>There’s absolutely no need to use a mammoth like Google Analytics for a personal blog or pet project. It’s more than wasteful – you’re offering free data to Google in exchange for a fancy dashboard so you can play I’m-SEO-master-at-Adcorp-LLC. Someone has to keep the data, of course, but I’d argue that a decentralized approach is much safer and probably more ethical than a data monopoly held by a single huge advertising company.</p>
<p>So what are the alternatives? There are a few competitors – we already discussed that in a <a href="#the-lack-of-competition">previous section</a>. But then we have all this processing power just lying around, free and unused; we might as well make better use of it. Smartphones have amazing processing, networking and storage capabilities, yet for many reasons they become outdated very quickly, which translates to getting sold (in the best case); shoved into oblivion in our designated e-junk clutter drawer; or just discarded.</p>
<p>It is just sad that we have these tiny slabs of processing power that could <a href="https://www.realclearscience.com/articles/2019/07/02/your_mobile_phone_vs_apollo_11s_guidance_computer_111026.html">navigate Man to the Moon and back thousands of times over</a>, and we can’t seem to quite find any better occupation for them other than sitting in a dusty drawer for years or getting trashed. That is why even if it takes a little extra effort, I’d rather repurpose and reuse something I already own than subscribe to the fanciest new PaaS.</p>
<h2>Setting up a free HTTPS home server (2020-06-27)</h2>
<div class="image-box stretch">
<div>
<a href="/2020/06/free_https_home_server.html">
<img class="lazy" data-src="/blog/assets/images/2020/cool-background.png" alt="">
<noscript>
<img src="/blog/assets/images/2020/cool-background.png" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Try searching for “free dynamic dns https”, “free domain with SSL” or anything similar. There won’t be a lot of meaningful results. Sure, some of the results are pretty close, like <a href="https://www.freecodecamp.org/news/free-https-c051ca570324/">this guide</a> on how to get free SSL certification from Cloudflare, or <a href="https://medium.com/@jeremygale/how-to-set-up-a-free-dynamic-hostname-with-ssl-cert-using-google-domains-58929fdfbb7a">this one</a> on setting up a free dynamic hostname with SSL, but they all assume you <em>already own a domain</em>. If you’re looking for a completely free domain that you can use for your personal web server that also has verified SSL, there are very few results.</p>
<p>But why was I even looking for this?</p>
<p>I’m working on a side project. It has a web server that communicates with a static web page hosted on Github Pages. There are a lot of ways of setting that up; in my particular case, I have a local (as in, in my house) HTTP web server accepting traffic on a non-standard port (port <code>80</code> is blocked by my ISP <a href="https://www.reddit.com/r/InternetBrasil/comments/e9v5o0/abertura_das_portas_80_e_443_na_claronet/">for commercial reasons</a> – this detail is of paramount importance, but more on that later). It is accessible through my external IP (which is dynamic), which can be mapped to a dynamic DNS domain.</p>
<p>I wanted to run a simple API on the web server and access it through static pages (like this blog) hosted on Github Pages (which has a verified SSL certificate). <a href="https://stackoverflow.com/questions/62378047/is-it-possible-to-make-a-cross-domain-javascript-request-to-http-from-https">I asked the Internet</a> if it is possible to call from a SSL-verified page (in JavaScript) a different server that does not have a verified SSL certificate (that is, my aforementioned webapp running in my home server). It isn’t, so the conclusion was that I needed somehow to get a verified SSL certificate for my dynamic DNS domain.</p>
<p>Having no idea whether this was possible, I started to research.</p>
<!-- more -->
<h2 id="setting-up-dynamic-dns">Setting up Dynamic DNS</h2>
<p>Most ISPs provide dynamic IP addresses for their residential customers, while static IP addresses are usually reserved for the “commercial” or “business” tiers. That means your public IP address changes (usually every <a href="https://vicimediainc.com/often-ip-addresses-change/">14 days</a>), so DNS servers have to keep track of your changing address somehow. That kind of service is called Dynamic DNS, or DDNS for short.</p>
<p><a href="https://free-for.dev/#/?id=dns">Several companies</a> provide DDNS service for free. Some of them also provide a free subdomain, which is useful if you don’t own a domain yourself (I don’t). I’ve tried out most of the free DDNS providers, the most prominent of which seem to be Hurricane Electric, No-ip, Dynu and DuckDNS. If you’re up for it, there are even several blog posts out there explaining <a href="https://blog.heckel.io/2016/12/31/your-own-dynamic-dns-server-powerdns-mysql/">how to set up your own dynamic DNS server</a>.</p>
<p>I wasn’t feeling too adventurous, so I decided to set up shop with DuckDNS. It is really easy to set up, comes with a great HTTP API for updating the domain’s TXT record, provides free subdomains that don’t expire (No-ip, for instance, has subdomains that expire after 30 days), and has a valid SSL certificate. They have a page <a href="https://www.duckdns.org/install.jsp">explaining how to set up the actual DDNS service</a>, so I’ll skip that.</p>
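<p>For reference, the update call is just an HTTP GET against their API. Here’s a rough sketch of building that request (the subdomain and token below are placeholders, and the exact parameters should be checked against DuckDNS’s spec page):</p>

```shell
# Placeholders -- substitute the values DuckDNS gives you.
SUBDOMAIN="my-subdomain"
TOKEN="your-duckdns-token"

# Leaving ip= empty tells DuckDNS to use the address the request came from.
UPDATE_URL="https://www.duckdns.org/update?domains=${SUBDOMAIN}&token=${TOKEN}&ip="
echo "$UPDATE_URL"
```

<p>In practice you’d call that URL with <code>curl</code> from a cron job every few minutes, so the record keeps tracking your changing IP.</p>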
<h3 id="caveat-carrier-grade-nat">Caveat: carrier-grade NAT</h3>
<p>One big potential problem in the DDNS setup is whether you’re behind a <a href="https://www.wikiwand.com/en/Carrier-grade_NAT">carrier-grade NAT (CGNAT)</a>, which some ISPs unfortunately use. In short, being behind a CGNAT boils down to not having a public IP address – you’re part of your ISP’s private network, and your router’s “public” IP address is actually a private address within that network, which the ISP translates to and from the Internet.</p>
<p>CGNATs suck: they essentially <a href="https://www.reddit.com/r/HomeNetworking/comments/6ahcp6/rtn66u_isp_changed_to_cgnat_broke_ddns/">make using DDNS impossible</a>. You can find out if you’re behind a CGNAT by comparing your WAN IP address (displayed in the router admin page) with your public IP. If they differ, you’re probably behind a CGNAT.</p>
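<p>A related hint: RFC 6598 reserves the <code>100.64.0.0/10</code> block specifically for carrier-grade NAT, so a WAN address in that range is a strong sign by itself. Here’s a small shell sketch of both checks (the <code>curl</code> comparison needs network access, so it’s commented out):</p>

```shell
# Succeeds if an IPv4 address falls in 100.64.0.0/10,
# the block RFC 6598 reserves for carrier-grade NAT.
in_cgnat_range() {
  first=$(echo "$1" | cut -d. -f1)
  second=$(echo "$1" | cut -d. -f2)
  [ "$first" = "100" ] && [ "$second" -ge 64 ] && [ "$second" -le 127 ]
}

in_cgnat_range "100.72.10.1" && echo "likely behind CGNAT"
in_cgnat_range "203.0.113.7" || echo "regular public address"

# The more direct test: compare your router's WAN IP with your public IP.
# public_ip=$(curl -s https://ifconfig.me)
# [ "$router_wan_ip" = "$public_ip" ] || echo "probably behind CGNAT"
```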
<h2 id="setting-up-a-verified-ssl">Setting up a verified SSL</h2>
<p>I had set up the dynamic DNS service, and the next step was finding out if it was even possible to obtain a free valid SSL certificate for my subdomain.</p>
<p><a href="https://letsencrypt.org/getting-started/">Let’s Encrypt</a> provides free valid SSL certificates, which are usually obtained using <a href="https://certbot.eff.org/">Certbot</a>, a handy tool that handles most of the complicated verification process for you. There are <a href="https://letsencrypt.org/docs/client-options/">several other</a> tools that implement the same protocol used by Let’s Encrypt, but I really recommend Certbot – it has much better out-of-the-box functionality than all the other tools I tried, and the community is much bigger. The only caveat I could find is that you need <code>sudo</code> access to use it properly.</p>
<p>One thing I wish someone had told me before I spent hours looking for alternatives to Certbot is that <strong>it doesn’t have to be executed on the host that is ultimately going to use the SSL certificate</strong>. This might be a crucial bit of information if you can’t run as root on the actual host, which was my case. It is perfectly fine to run Certbot on a separate computer, obtain the SSL certificates and then <code>scp</code> them to the correct host.</p>
<p>Now, as I mentioned, my ISP blocks incoming traffic to port <code>80</code> for its residential customers. This is relevant because Let’s Encrypt uses the <strong>HTTP-01 challenge</strong> by default in the SSL verification process, and it requires ports <code>80</code> and <code>443</code> to be open. However, LE also offers the alternative <strong><a href="https://letsencrypt.org/docs/challenge-types/">DNS-01 challenge</a></strong>, which <strong>does not</strong> require those ports to be open (but does require the ability to update TXT domain records, which not all DDNS providers allow – No-ip, for instance, does not). I happened to find out about this by reading <a href="https://www.splitbrain.org/blog/2017-08/10-homeassistant_duckdns_letsencrypt">this very helpful post</a> from someone in a similar predicament (home server, port <code>80</code> not available) who used this alternative challenge successfully with DuckDNS (thank you!). In <a href="https://serverfault.com/a/812038/578968">this Server Fault answer</a>, the poster explains how to use Certbot with the DNS-01 challenge.</p>
<h3 id="running-certbot-with-dns-01-and-duckdns">Running Certbot with DNS-01 and DuckDNS</h3>
<p>DNS-01 works by confirming that you can modify the DNS TXT record of your domain.</p>
<p>Here’s the command to start SSL verification with Certbot using DNS-01 and a DuckDNS subdomain, and the expected output:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">sudo </span>certbot <span class="nt">-d</span> my-subdomain.duckdns.org <span class="nt">--manual</span> <span class="nt">--preferred-challenges</span> dns certonly
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator manual, Installer None
Obtaining a new certificate
Performing the following challenges:
dns-01 challenge <span class="k">for </span>my-subdomain.duckdns.org
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NOTE: The IP of this machine will be publicly logged as having requested this
certificate. If you<span class="s1">'re running certbot in manual mode on a machine that is not
your server, please ensure you'</span>re okay with that.
Are you OK with your IP being logged?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
<span class="o">(</span>Y<span class="o">)</span>es/<span class="o">(</span>N<span class="o">)</span>o: Y
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please deploy a DNS TXT record under the name
_acme-challenge.my-subdomain.duckdns.org with the following value:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Before continuing, verify the record is deployed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue</code></pre></div>
<p>At this point you have to do as the program says: update the DNS TXT record. Thankfully, this is exceedingly easy to do with DuckDNS (see their <a href="https://www.duckdns.org/spec.jsp">spec page</a> for instructions).</p>
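<p>With DuckDNS this boils down to one more HTTP GET, this time with a <code>txt</code> parameter. A sketch with placeholder values (their spec page is the source of truth for the parameter names):</p>

```shell
SUBDOMAIN="my-subdomain"
TOKEN="your-duckdns-token"
ACME_TXT="value-certbot-asked-you-to-deploy"   # the value Certbot printed

# DuckDNS updates the TXT record for your subdomain with this call:
TXT_URL="https://www.duckdns.org/update?domains=${SUBDOMAIN}&token=${TOKEN}&txt=${ACME_TXT}"
echo "$TXT_URL"
# curl -fsS "$TXT_URL"
```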
<p>You can verify that the TXT record was updated by running <code>dig</code>:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span>dig my-subdomain.duckdns.org TXT
<span class="p">;</span> <<<span class="o">>></span> DiG 9.11.3-1ubuntu1.12-Ubuntu <<<span class="o">>></span> my-subdomain.duckdns.org TXT
<span class="p">;;</span> global options: +cmd
<span class="p">;;</span> Got answer:
<span class="p">;;</span> ->>HEADER<span class="o"><<-</span> <span class="no">opcode</span><span class="sh">: QUERY, status: NOERROR, id: 21765
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;my-subdomain.duckdns.org. IN TXT
;; ANSWER SECTION:
my-subdomain.duckdns.org. 59 IN TXT "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
;; Query time: 335 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Mon Jun 15 18:50:41 -03 2020
;; MSG SIZE rcvd: 114</span></code></pre></div>
<p>Once you’ve confirmed the TXT record is live, the remainder of Certbot’s output should be this success message:</p>
<div class="highlight"><pre><code class="language-bash">Waiting <span class="k">for </span>verification...
Cleaning up challenges
IMPORTANT NOTES:
- Congratulations! Your certificate and chain have been saved at:
/etc/letsencrypt/live/my-subdomain.duckdns.org/fullchain.pem
Your key file has been saved at:
/etc/letsencrypt/live/my-subdomain.duckdns.org/privkey.pem
Your cert will expire on 2020-09-13. To obtain a new or tweaked
version of this certificate <span class="k">in </span>the future, simply run certbot
again. To non-interactively renew <span class="k">*</span>all<span class="k">*</span> of your certificates, run
<span class="s2">"certbot renew"</span>
- If you like Certbot, please consider supporting our work by:
Donating to ISRG / Let<span class="s1">'s Encrypt: https://letsencrypt.org/donate
Donating to EFF: https://eff.org/donate-le</span></code></pre></div>
<p>All set! You now have a valid SSL certificate. You’ll still need to install it in the right place, which will vary depending on what web server you’re using. For example, if you’re using Nginx, the configuration file might look something like this:</p>
<div class="highlight"><pre><code>server {
    ssl_certificate /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;
    ...
}</code></pre></div>
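<p>If you want to sanity-check the files before wiring them into the web server, <code>openssl x509</code> prints a certificate’s subject and validity dates. The snippet below generates a throwaway self-signed certificate just to demonstrate the inspection command; on your server you’d point it at the <code>fullchain.pem</code> path from Certbot’s output instead:</p>

```shell
# Demo only: create a throwaway self-signed cert in /tmp...
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo.key -out /tmp/demo.crt -days 30 \
  -subj "/CN=my-subdomain.duckdns.org" 2>/dev/null

# ...and inspect it the same way you'd inspect the real one:
openssl x509 -in /tmp/demo.crt -noout -subject -dates

# On the real host (path from Certbot's output):
# sudo openssl x509 -in /etc/letsencrypt/live/my-subdomain.duckdns.org/fullchain.pem -noout -dates
```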
<h2 id="conclusion">Conclusion</h2>
<p>There are quite a lot of shady-looking websites out there offering, for a monthly fee, exactly what I just wrote about. When researching this, not knowing too much about most of these topics, I was almost fooled into accepting that this just couldn’t be done for free for some unknown technical reason. There <em>had</em> to be a reason why there were no Google results for this – maybe my case was too specific, or maybe other people are less cheap than I am and just pay for a domain and get the SSL stuff for free.</p>
<p>I still have no good explanation as to why the kind of guide I just wrote above didn’t show up in my research. Maybe people don’t care about home servers, or maybe I’m just not too good at searching (probably both). In any case, hopefully this post will make it clear that setting up a DDNS subdomain with SSL for free is not only possible, but really not that complicated.</p>
tag:lbrito1.github.io,2020-05-30:/blog/2020/05/communication.htmlCommunication tips for remote developers2020-05-30T15:16:00Z2020-05-30T15:16:00Z<div class="image-box stretch">
<div>
<a href="/2020/05/communication.html">
<img class="lazy" data-src="/blog/assets/images/2020/bridge.jpg" alt="San Francisco Bay bridge.">
<noscript>
<img src="/blog/assets/images/2020/bridge.jpg" alt="San Francisco Bay bridge.">
</noscript>
</a>
</div>
<div class="image-caption">We're all remote -- for now.</div>
</div>
<p>Communicating well with your co-workers and managers is supremely important for a software developer, and even more so for a remote one. With so many more people working remotely due to the COVID-19 pandemic, this topic has become a lot more relevant.</p>
<p>I’ve seen people hint at this more than a few times over the years, but I didn’t really “get it” until I started working as a fully remote engineer. I also find it important to understand not only <em>what</em> we should be doing to achieve efficient communication, but also <em>why</em> we should be doing those things in those ways.</p>
<p>To me, the single most important thing to keep in mind is that people’s mental resources (time, attention span, and so on) are, just like yours, limited.</p>
<!-- more -->
<p>That might be obvious, but different management styles might make it seem otherwise – a more hands-on manager may seem acutely aware of what you’re doing all the time, but that is hardly ever the case. Management styles aside, managers are, after all, humans like the rest of us, with limited time and resources. They can’t possibly have the same insight into each task as the respective engineers working on them.</p>
<p>The corollary to that is that it is your job to keep managers in the loop, providing the right amount of information at the right time through the right channel (text, video call, presentation…). Just like any other skill, this is something you can learn over time.</p>
<p>Similar reasoning applies to co-workers: people are usually deeply involved in whatever it is they’re doing, so they won’t usually know too many details about what you or other co-workers are doing all the time.</p>
<p>A lot of the things you need to do are fairly obvious and well-known: be clear about what you’re saying, keep communicating frequently, etc. Other things aren’t too obvious (at least to me, that is) and are worth sharing in this short blog post.</p>
<h2 id="dont-write-a-novel">Don’t write a novel</h2>
<p>Reading is hard. People’s availability and attention span vary. Try to get your point across in as few words as possible.</p>
<p>If I’m writing an issue update, pull request, or other technical information, I usually start with a more winding text and then prune as much of it as I can. This can be really simple, like changing “According to #85748, the problem I described started when…” to “The problem began when … (#85748)”.</p>
<p>This can’t be done at the expense of clarity, though. It is preferable to write or say “I think option B is the way to go” than an ambiguous “sounds good”.</p>
<h2 id="manage-expectations">Manage expectations</h2>
<p>People don’t like to feel disappointed. To avoid unfulfilled expectations, it is your job to make sure those expectations stay realistic – all the more so when the task evolves or unravels into something much more complicated than people originally expected.</p>
<p>As we all know, there’s a lot of uncertainty in this job. Something that seems easy might actually be super easy or might be very hard. Unexpected difficulties are expected, and most people are fine with that <strong>as long as they also think that those problems are actually problems.</strong></p>
<p>That’s a huge caveat: if you’re unable to convince other people about the seriousness of the unexpected problems you’re facing, you might as well not say anything at all about them. People usually only believe what they understand, and it is your job to properly communicate that to people less involved in the task than you are.</p>
<h2 id="tailor-to-the-audience">Tailor to the audience</h2>
<p>Different people use different lingo to express the same things. Sales people will use different terms than engineering people.</p>
<p>Adjusting your language to the audience isn’t just about replacing technical words with other words, though, but also about cropping the information in the right way.</p>
<p>Excess information that isn’t relevant to the point you want to get across generates noise and confusion. This might be seen as a broader definition of the first topic: if you can manage to get your point across with less information (whatever that is: spreadsheets, images, etc), then that is certainly desirable. If the person you’re talking or presenting to is only interested in 1 of the 3 columns of a spreadsheet, although the other 2 might be insightful to you, you should probably refrain from showing them at that moment.</p>
<p>And that is the note I’m ending this post on. These three things are probably obvious or second nature to a lot of people, but at least to me, it took a few years of remote work to fully appreciate them. Hopefully this post can be helpful to other like-minded developers.</p>
tag:lbrito1.github.io,2020-05-16:/blog/2020/05/figuring_out_nvidia_x_linux.htmlFiguring out the Nvidia x Linux puzzle2020-05-16T19:48:00Z2020-05-16T19:48:00Z<div class="image-box stretch">
<div>
<a href="/2020/05/figuring_out_nvidia_x_linux.html">
<img class="lazy" data-src="/blog/assets/images/2020/power.png" alt="Ubuntu power consumption chart.">
<noscript>
<img src="/blog/assets/images/2020/power.png" alt="Ubuntu power consumption chart.">
</noscript>
</a>
</div>
<div class="image-caption">Ubuntu's power rate over time.</div>
</div>
<p>I’ve struggled with some kind of problem with Nvidia graphics cards on Linux since forever.</p>
<p>Most commonly, an external monitor wouldn’t work or the dedicated card would refuse to power off when it should.</p>
<p>The latter problem – a power-hogging discrete Nvidia card not turning off when it isn’t needed, specifically in <a href="https://www.wikiwand.com/en/Nvidia_Optimus">Optimus</a>-enabled laptops – has consistently haunted me throughout the years. At least in my experience, this problem is in that sweet spot of things that are definitely annoying and kind of inconvenient, but complicated enough not to be worth the several work-hours needed to definitively solve it.</p>
<!-- more -->
<p>I know that I’m not alone here, as other people over the internet have said things like <em><a href="https://forum.manjaro.org/t/solved-bumblebee-issues-with-bbswitch/70137">“I’ve been pulling my hair out for the past few hours trying to configure my graphics drivers on my laptop”</a></em>. I’ve also not been a total sloth about this: I have tried many times in the past to fix it, consistently finding myself thinking “okay, <em>now</em> this is fixed”, only to notice a few hours or days later that my laptop battery was drained in an hour and the problem was back. I actually re-wrote a significant part of this post because, when I thought I was finished, my Nvidia card started turning on again and I had to do more research.</p>
<p>Taking advantage of the extra time on my hands due to the Covid-19 city-wide lockdown, I decided to persistently look for a solution to this issue. This guide simply documents that process. I use Ubuntu, but similar steps should be valid for whatever distro you’re using. Also, some or many of the steps might not actually be necessary – they’re just what happened to finally work in my case.</p>
<h3 id="install-the-proprietary-nvidia-drivers">1. Install the proprietary Nvidia drivers</h3>
<p>Ubuntu uses the open-source Nouveau driver for Nvidia cards, which doesn’t play well with Optimus-enabled laptops. Let’s install the proprietary Nvidia driver.</p>
<p>First, find out what’s the recommended Nvidia driver:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span>ubuntu-drivers devices
<span class="o">==</span> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 <span class="o">==</span>
modalias : pci:v000010DEd00002191sv00001462sd00001274bc03sc00i00
vendor : NVIDIA Corporation
driver : nvidia-driver-435 - distro non-free
driver : nvidia-driver-440 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free <span class="nb">builtin</span>
<span class="o">==</span> /sys/devices/pci0000:00/0000:00:14.3 <span class="o">==</span>
modalias : pci:v00008086d0000A370sv00008086sd00000034bc02sc80i00
vendor : Intel Corporation
manual_install: True
driver : backport-iwlwifi-dkms - distro free</code></pre></div>
<p>Then install it:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">sudo </span>apt <span class="nb">install </span>nvidia-driver-440</code></pre></div>
<p>Another option is to pick the driver in the Additional Drivers tab of the <code>Software &amp; Updates</code> tool:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/2020-05-16-05-04.nvidia.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/2020-05-16-05-04.nvidia.png" alt="Nvidia proprietary driver option in Ubuntu's Additional Drivers menu.">
<noscript>
<img src="/blog/assets/images/2020/2020-05-16-05-04.nvidia.png" alt="Nvidia proprietary driver option in Ubuntu's Additional Drivers menu.">
</noscript>
</a>
</div>
<div class="image-caption">Nvidia proprietary driver option in Ubuntu's Additional Drivers menu.</div>
</div>
<p>Nvidia’s proprietary driver lets you choose if you want to use the dedicated or integrated GPU, which you can try setting:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/nvidia-setting.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/nvidia-setting.png" alt="Nvidia proprietary driver's GPU selection menu.">
<noscript>
<img src="/blog/assets/images/2020/nvidia-setting.png" alt="Nvidia proprietary driver's GPU selection menu.">
</noscript>
</a>
</div>
<div class="image-caption">Nvidia proprietary driver's GPU selection menu.</div>
</div>
<p>Now, if you’re lucky, this might be enough. Check the power usage using Ubuntu’s <code>Power Statistics</code> tool or <code>powertop</code>: if the Nvidia card is successfully turned off, typical power usage is somewhere between 8 and 14W. If, like me, this changed nothing in your power usage, read on.</p>
<h3 id="install-and-configure-bbswitch">2. Install and configure bbswitch</h3>
<p>Although Nvidia’s proprietary driver allows selecting between integrated and dedicated cards, in my experience that setting has had no effect at all, with both cards always being powered on anyway.</p>
<p><code>bbswitch</code> is a tool that allows you to select which card you want your system to use. Ubuntu has the bbswitch-dkms package available:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">sudo </span>apt <span class="nb">install </span>bbswitch-dkms</code></pre></div>
<p>Then configure it to always turn off the discrete card by creating the following file:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">cat</span> /etc/modprobe.d/bbswitch.conf
options bbswitch <span class="nv">load_state</span><span class="o">=</span>0</code></pre></div>
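<p>With the module loaded, bbswitch reports (and accepts) the card’s power state through <code>/proc/acpi/bbswitch</code>, which is handy for checking that the configuration took effect. The commands below are a sketch (the PCI address shown is an example; yours will be whatever your card reports):</p>

```shell
# On a machine with bbswitch loaded, the card's state lives here:
# cat /proc/acpi/bbswitch                    # e.g. "0000:01:00.0 OFF"
# echo OFF | sudo tee /proc/acpi/bbswitch    # force the card off by hand

# Tiny helper to pull the ON/OFF field out of that status line:
bbswitch_state() {
  awk '{print $2}'
}

echo "0000:01:00.0 OFF" | bbswitch_state
```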
<h3 id="blacklist-nouveau-driver">3. Blacklist Nouveau driver</h3>
<p>According to <a href="https://askubuntu.com/a/1044095/463850">this Ask Ubuntu answer</a>, there seem to be at least a couple of bugs that result in Ubuntu trying to load the Nouveau module even if you’re using a proprietary Nvidia driver. When that happens, the discrete Nvidia GPU turns on and starts hogging a lot of power.</p>
<p>Blacklisting the Nouveau module solved this issue for me (depending on your setup, you may also need to rebuild the initramfs with <code>sudo update-initramfs -u</code> for the blacklist to take effect at boot):</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">sudo </span>bash <span class="nt">-c</span> <span class="s2">"echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"</span>
<span class="nv">$ </span><span class="nb">sudo </span>bash <span class="nt">-c</span> <span class="s2">"echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"</span></code></pre></div>
<p>Restart and confirm that the right driver is loaded:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span>gpu-manager | <span class="nb">grep </span>nouveau
Is nouveau loaded? no
Is nouveau blacklisted? <span class="nb">yes</span></code></pre></div>
<h3 id="blacklist-some-nvidia-modules">4. Blacklist some Nvidia modules</h3>
<p>Even after the above, my system kept turning on the nvidia card seemingly at random. I found <a href="https://github.com/Bumblebee-Project/Bumblebee/issues/951">this post</a> in the Bumblebee issue tracker to be extremely helpful:</p>
<blockquote>
<p>“bumblebee can turn the nvidia card off when it starts, but as soon as the nvidia module is loaded, it loads nvidia_drm, which links to drm_kms_helper and then bumblebee can’t remove the nvidia modules. This means that bumblebee can’t turn off the nvidia card when optirun terminates. To fix this, I added “alias nvidia_drm off” and “alias nvidia_modeset off” to my conf file in /etc/modprobe.d.”</p>
</blockquote>
<p>This is the file created by the OP:</p>
<div class="highlight"><pre><code class="language-bash"><span class="nv">$ </span><span class="nb">cat</span> /etc/modprobe.d/nvidia.conf
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
<span class="nb">alias </span>nvidia_drm off
<span class="nb">alias </span>nvidia_modeset off</code></pre></div>
<p>After creating this file and restarting, my system was finally using only the Intel integrated card. Hopefully this time it’ll stay that way.</p>
<h3 id="results">Results</h3>
<p>Here’s a chart of my laptop’s power rate:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/power.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/power.png" alt="Ubuntu power consumption chart.">
<noscript>
<img src="/blog/assets/images/2020/power.png" alt="Ubuntu power consumption chart.">
</noscript>
</a>
</div>
<div class="image-caption">Ubuntu's power rate over time.</div>
</div>
<p>Using the integrated Intel GPU, the rate fluctuates around 10W. When the Nvidia card kicks in, which is what was going on around the middle of the chart, it jumps to 40W+.</p>
<h3 id="references">References</h3>
<ul>
<li><a href="https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-18-04-bionic-beaver-linux">https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-18-04-bionic-beaver-linux</a></li>
<li><a href="https://github.com/Bumblebee-Project/bbswitch">https://github.com/Bumblebee-Project/bbswitch</a></li>
<li><a href="https://github.com/Bumblebee-Project/Bumblebee/issues/951">https://github.com/Bumblebee-Project/Bumblebee/issues/951</a></li>
<li><a href="https://turlucode.com/optimus-bbswitch-on-ubuntu-18-04/">https://turlucode.com/optimus-bbswitch-on-ubuntu-18-04/</a></li>
</ul>
tag:lbrito1.github.io,2020-02-05:/blog/2020/02/repurposing-android.htmlRepurposing an old Android phone as a Ruby web server2020-02-05T12:24:41Z2020-02-05T12:24:41Z<div class="image-box stretch">
<div>
<a href="/2020/02/repurposing-android.html">
<img class="lazy" data-src="/blog/assets/images/2020/old-android.jpg" alt="Old smartphones on a desk.">
<noscript>
<img src="/blog/assets/images/2020/old-android.jpg" alt="Old smartphones on a desk.">
</noscript>
</a>
</div>
<div class="image-caption">CC-BY Carlos Varela, https://www.flickr.com/photos/c32/7755470064</div>
</div>
<p>Do you have an old Android phone? Sure you do! There’s a mind-blowing amount of electronic waste of all kinds, and with the average person in developed countries <a href="https://www.cnbc.com/2019/05/17/smartphone-users-are-waiting-longer-before-upgrading-heres-why.html">discarding their phones every couple of years</a>, discarded smartphones are probably one of the most common forms of e-waste.</p>
<p>I had an old Motorola G5 Cedric gathering dust, so I decided to do something with it – it is now running a Puma web server with a simple Sinatra webapp.</p>
<p>Now, before going any further, you might be thinking: what is the real, practical use of all this? An old Android phone probably isn’t going to have stellar performance, but honestly, neither do those <code>t2.nano</code>s. I’ve yet to deploy any “real” code on an Android, but even the cheaper and older phones commonly have quad-core or even octa-core CPUs and at least 2GB of RAM, so in theory a phone <em>should</em> be close – ballpark, at least – to the most modest cloud IaaS offerings out there (a <code>t2.nano</code> has 512MB, for instance). Of course, a phone has an ARM processor while IaaS instances are usually x86, and memory management is entirely different as well, but still – we’re talking ballpark estimates here.</p>
<p>Anyway, this is a short tutorial on how to repurpose an Android device as a web server – or any number of different things, really.</p>
<!-- more -->
<h2 id="install-termux">1. Install Termux</h2>
<p>First of all, we need a Linux environment on our phone. Termux is a terminal emulator and Linux environment for Android. It’s available on the Google Play Store. No additional configuration is needed after installation.</p>
<h2 id="set-up-ssh">2. Set up SSH</h2>
<p>You won’t want to type a lot of commands into a tiny touchscreen, so let’s set up ssh so that we can log into Termux remotely.</p>
<p>There are <a href="https://wiki.termux.com/wiki/Remote_Access">several ways</a> of doing this, but I’ve found that the easiest way is through a software called <strong>Dropbear</strong>:</p>
<p><strong>Run this on Android:</strong></p>
<div class="highlight"><pre><code class="language-bash">pkg upgrade
pkg <span class="nb">install </span>dropbear</code></pre></div>
<p>You can use password-based authentication or public key authentication. You should use key-based authentication, but for testing purposes password-based is easiest.</p>
<p><strong>Run this on Android:</strong></p>
<div class="highlight"><pre><code class="language-bash">passwd
New password:
Retype new password:
New password was successfully set.</code></pre></div>
<p><strong>Bonus points:</strong> install a terminal multiplexer like <code>tmux</code> or <code>screen</code>. This will make your life much easier when running stuff via ssh:</p>
<div class="highlight"><pre><code class="language-bash">pkg <span class="nb">install </span>tmux</code></pre></div>
<p>Now go ahead and test the connection on your desktop:</p>
<div class="highlight"><pre><code class="language-bash">ssh android-ip-address <span class="nt">-p</span> 8022</code></pre></div>
<h2 id="set-up-static-ip-address-on-android">3. Set up static IP address on Android</h2>
<p>Go to wifi settings, disable DHCP and assign an IP address for your phone.</p>
<p>This is necessary so that your router won’t assign a new IP address to your phone every few hours/days, which would make configuration a lot harder.</p>
<h2 id="install-ruby-bundler-sinatra-and-puma">4. Install Ruby, Bundler, Sinatra and Puma</h2>
<p>Sinatra is a lightweight web application framework, and Puma is a web server.</p>
<p>Ruby is, well, Ruby!</p>
<p>Of course, Sinatra and Puma are just suggestions – you could even use full-blown Rails on your phone, as described in <a href="https://mbobin.me/ruby/2017/02/25/ruby-on-rails-on-android.html">this neat tutorial</a>. Just <a href="https://devcenter.heroku.com/articles/ruby-default-web-server#why-not-webrick">don’t use WEBrick</a>, the default Rails web server in development – it is single-process and single-threaded, and thus not suitable for production environments (it is fine for small experiments, though).</p>
<p><strong>Run this on Android:</strong></p>
<div class="highlight"><pre><code class="language-bash">pkg <span class="nb">install </span>ruby
gem <span class="nb">install </span>sinatra puma</code></pre></div>
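<p>To have something to serve, here’s a minimal Sinatra app written straight from the Termux shell. The file name and route are my own arbitrary choices; also note that unrooted Android won’t let you bind ports below 1024, so pick a high port:</p>

```shell
# Write a one-route Sinatra app as a rackup file.
cat > config.ru <<'EOF'
require 'sinatra'

get '/' do
  "Hello from Android! #{Time.now}"
end

run Sinatra::Application
EOF

# Then serve it with Puma on a high port:
# puma -p 9292
```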
<h2 id="install-nginx">5. Install nginx</h2>
<p>nginx is a web server, reverse proxy and load balancer. Although it is most useful in multi-server setups, where it distributes requests among different instances, nginx is also a good idea in our setup because of the rate limiting (which helps against DoS attempts) and static file serving that it provides.</p>
<p><strong>Run this on Android:</strong></p>
<div class="highlight"><pre><code class="language-bash">pkg <span class="nb">install </span>nginx</code></pre></div>
<p>Now the slightly tricky part is configuring nginx to work with Puma. <a href="https://gist.github.com/ctalkington/4448153">This gist</a> is a pretty good start – copy & paste <code>nginx.conf</code> and change <code>appdir</code> to your webapp’s root dir. In my case, for example, that would be <code>/data/data/com.termux/files/home/android-sinatra</code>.</p>
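<p>As a rough sketch of how the pieces fit together (ports and names below are my own placeholders, not the gist’s values): nginx listens on the public-facing port and proxies everything to Puma on localhost:</p>

```nginx
upstream puma {
    # Puma listening on a high port inside Termux (placeholder port).
    server 127.0.0.1:9292;
}

server {
    listen 8080;

    location / {
        # Hand each request to the Puma upstream defined above.
        proxy_pass http://puma;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```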
<h2 id="set-up-port-forwarding">6. Set up port forwarding</h2>
<p>You probably want your web server to be accessible from the internet, so you’ll have to set up port forwarding in your router to redirect requests arriving at your public IP address to your brand-new Android web server.</p>
<p>How exactly to do this varies depending on your router. <a href="https://www.noip.com/support/knowledgebase/general-port-forwarding-guide/">Here’s</a> a pretty good tutorial to get you started.</p>
<h2 id="configure-a-dynamic-dns">7. Configure a dynamic DNS</h2>
<p>Most people have dynamic public IP addresses. In these cases it is useful to set up a dynamic DNS (DDNS) service, which provides you with a static domain name that always resolves to whatever your public IP address is at that moment.</p>
<p>There are few free services that provide DDNS nowadays; I’m using <a href="https://www.noip.com/">no-ip</a> and it has been okay so far. You do have to “renew” your domain every month though.</p>
<p>After setting up a DDNS, you’ll have to configure your router as well so that it periodically notifies the DDNS service with your current IP address. Again, how exactly to do this depends on your router model.</p>
<h2 id="hello-world">Hello world!</h2>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/android-web-server.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/android-web-server.jpg" alt="Puma and nginx running on a Motorola G5.">
<noscript>
<img src="/blog/assets/images/2020/android-web-server.jpg" alt="Puma and nginx running on a Motorola G5.">
</noscript>
</a>
</div>
<div class="image-caption">Puma and nginx running on a Motorola G5.</div>
</div>
<h2 id="under-siege">Under siege</h2>
<p>You can simulate real-world usage with <a href="https://www.joedog.org/siege-home/"><code>siege</code></a>, an HTTP load-testing tool. Here’s a screenshot of <code>siege</code> running against my setup with 3 concurrent users (real tests would use much bigger numbers):</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/siege.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/siege.jpg" alt="Screenshot of siege running on a terminal.">
<noscript>
<img src="/blog/assets/images/2020/siege.jpg" alt="Screenshot of siege running on a terminal.">
</noscript>
</a>
</div>
<div class="image-caption">siege running in the foreground; nginx logs and top on remote (android) running in the background terminals.</div>
</div>
<p>The numbers in that screenshot don’t matter much because our webapp was serving a simple 100-char response with a timestamp, but it is enough to at least know that the server can handle a few concurrent users.</p>
<h2 id="epilogue-safety">Epilogue: safety</h2>
<p>If you’ve watched <a href="https://en.wikipedia.org/wiki/Mr._Robot">Mr Robot</a>, you know that the internet can be a dangerous place. That is a lot more true if you have a web server open to the internet.</p>
<p>Within a few hours of opening up the server, it was already being crawled by all sorts of things. Most are innocuous indexing robots, but some are definitely not so nice, like these two requests:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2020/scanners.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2020/scanners.jpg" alt="nginx logs showing port scanning attacks.">
<noscript>
<img src="/blog/assets/images/2020/scanners.jpg" alt="nginx logs showing port scanning attacks.">
</noscript>
</a>
</div>
<div class="image-caption">Most of those requests seem fine, but the two in red are probably some kind of attack.</div>
</div>
<p>So the takeaway here is: keep all software updated, keep an eye on your access logs, and maybe work through nginx security guides such as <a href="https://www.cyberciti.biz/tips/linux-unix-bsd-nginx-webserver-security.html">this</a> and <a href="https://geekflare.com/nginx-webserver-security-hardening-guide/">this</a>.</p>
<h1>Speeding Up the Backend with Graph Theory</h1>
<p><em>2019-11-06</em></p>
<div class="image-box stretch">
<div>
<a href="/2019/11/speeding-up-backend-graph-theory.html">
<img class="lazy" data-src="/blog/assets/images/2019/0076-final-results.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0076-final-results.png" alt="Alternative text to describe image.">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Here at Sensor Tower we handle large volumes of data, so to keep things snappy for our customers we need to think carefully about how we process and serve that data.</p>
<p>Understanding the data we’re handling is a fundamental part of improving the way we serve it, and by analyzing how an important backend service worked, we were able to speed it up by a factor of four.</p>
<!-- more -->
<p><em>This post was originally posted in the <a href="https://sensortower.com/blog/speeding-up-the-backend-with-graph-theory">Sensor Tower blog</a>.</em></p>
<h2 id="background">Background</h2>
<p>We have many user-facing endpoints in Sensor Tower. So many, in fact, that we have numerous dashboards to keep tabs on how the system behaves.</p>
<p>A few months ago, we noticed that a particular and very important endpoint was very sluggish: While all the other endpoints of the same type had &lt;50ms latencies, this particular service took a leisurely 300 to 500ms to respond. Here’s a diagram of how that looked:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0070-sadface-backend.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0070-sadface-backend.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0070-sadface-backend.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>The customer doesn’t look very happy up there! So, we decided to take some time and do an in-depth analysis of that endpoint.</p>
<p>Okay, now to a more serious diagram. Here’s what that endpoint looked like:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0070-diagram.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0070-diagram.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0070-diagram.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>The numbered steps in the diagram above perform the following operations:</p>
<ol>
<li>Decode a Protobuf string and build a Ruby object from it;</li>
<li>Modify the object;</li>
<li>Encode the Ruby object back to Protobuf.</li>
</ol>
<p>In essence, the endpoint receives Protobuf-encoded strings, does some work on them, and returns a processed version of the Protobuf-encoded string to the client. If you don’t know what Protobuf is, that’s okay; I didn’t either. You can think of it as something similar to JSON: A serialized tree structure encoded using a binary format rather than text.<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote">1</a></sup></p>
<p>Once we pinpointed that the endpoint slowness was due to Protobuf parsing, the next step was to try and find bottlenecks in the algorithm. The proper way to do this (which is not just to read the code thoroughly) is by using a profiler.</p>
<p>With the help of <a href="https://ruby-prof.github.io">ruby-prof</a> (generate profile data) and <a href="https://kcachegrind.github.io">KCacheGrind</a> (view profile data), we were able to identify two methods, <code>#find_all</code> and <code>#encode</code>, that took a large portion of the CPU time:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0070-profiler.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0070-profiler.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0070-profiler.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>While a profiler is a useful tool to identify potential problems, it has its limitations. Profiling helps you visualize data from <em>a single run</em> through your code. That might be fine if you’re analyzing a simple algorithm that deals with very homogeneous input, but is not really enough in the case of our back-end service, which receives thousands of very different inputs every hour.</p>
<p>In other words, we also needed to validate the profiler results with more data.</p>
<p>Taking this understanding into account, we opted for benchmarking a few thousand requests. Specifically, we benchmarked the <code>#find_all</code> and <code>#encode</code> methods and found that, while their share of the total time varied from request to request, together they accounted for almost 100 percent of the endpoint’s total time. At this point we knew we could focus our attention on these two methods.</p>
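<p>A sketch of that kind of measurement, using Ruby’s built-in <code>Benchmark</code> module (the method names and workloads here are made-up stand-ins for <code>#find_all</code> and <code>#encode</code>, not the real code):</p>

```ruby
require "benchmark"

# Stand-in for #find_all: pretend this decodes and modifies the object.
def decode_and_modify(payload)
  payload.chars.sort.join
end

# Stand-in for #encode: pretend this re-serializes the object.
def reencode(payload)
  payload.reverse
end

# Benchmark over many varied inputs, not a single profiler run.
inputs = Array.new(1_000) { |i| "payload-#{i}" * 10 }

find_all_time = Benchmark.realtime { inputs.each { |p| decode_and_modify(p) } }
encode_time   = Benchmark.realtime { inputs.each { |p| reencode(p) } }

# If these two dominate the endpoint's total time, they are the
# places worth optimizing.
puts format("find_all: %.4fs, encode: %.4fs", find_all_time, encode_time)
```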
<p>Naturally, the first step was to understand what each of those methods did.</p>
<p><code>#find_all</code> is responsible for decoding and modifying the object (steps one and two in the diagram), while <code>#encode</code> is exactly what the name implies: It re-encodes the modified Ruby object back to Protobuf (step three).</p>
<p>With that said, let’s go through the optimizations performed in both of these methods.</p>
<h2 id="first-optimization-encode">First Optimization: Encode</h2>
<p>Before we dive into the first optimization, let’s first explain how exactly the encoding/decoding processing works. Here’s an overview of what they do:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0071-decode-encode.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0071-decode-encode.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0071-decode-encode.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption">*Those Strings aren't really Protobuf -- because it is a binary encoding, it isn't too easy on the eyes, so for the sake of readability we're using this pseudo-JSON representation.</div>
</div>
<p>There are two things we need to point out about this decoding/encoding process that might not be obvious:</p>
<ol>
<li>Both encoding and decoding work recursively, starting at the root and finding their way down to the leaves;</li>
<li>Each node in the Ruby object also contains the original Protobuf string for that node. So for instance the node <code>C</code> in the example above also contains the following string: <code>"{ C: { B: [A]}, F: [D, X1] }"</code>; the node <code>B</code> contains this other string: <code>"{ B: [A] }"</code>, and so on.</li>
</ol>
<p>Now we’re ready to understand the actual optimization.</p>
<p>Let’s take a detailed look at the modification process:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0072-decode-encode-detail.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0072-decode-encode-detail.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0072-decode-encode-detail.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>One of the most fundamental steps of any optimization is avoiding repeated work. If we look closely, we can see that the nodes in blue (A, B, D) <em>were not modified</em>: Look at the strings generated by decode (left side, in yellow) and compare them with the ones generated by encode (blue, to the right)—they’re identical! Conversely, nodes in red (C, F) were indeed modified: The strings are different. So, now we know there is some potentially repeated work going on.</p>
<p>The first optimization leveraged this repeated work. Instead of always encoding every single node, we now encode <em>only those nodes that were modified</em>. All the rest of the nodes already have a valid Protobuf string stored as an instance variable, and that string is identical to what we would obtain if we were to run <code>#encode</code> on them.</p>
<p>The actual code change to implement this was quite simple: Just a matter of adding a <code>dirty</code> flag to each node, and marking the node as <code>@dirty = true</code> if it or one of its descendants was modified.</p>
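<p>A minimal sketch of the idea (the class and the pseudo string format are illustrative, not the actual production code): each node caches the string it was decoded from, and <code>#encode</code> re-serializes only dirty subtrees.</p>

```ruby
# Illustrative dirty-flag sketch; names and format are made up.
class Node
  attr_reader :children

  def initialize(name, raw, children = [])
    @name = name
    @raw = raw            # cached string this node was decoded from
    @children = children
    @dirty = false
  end

  def modify!(new_name)
    @name = new_name
    @dirty = true         # invalidate the cached string for this subtree
  end

  # A node is dirty if it or any descendant was modified, since its
  # cached string embeds the serialization of every descendant.
  def dirty?
    @dirty || @children.any?(&:dirty?)
  end

  def encode
    return @raw unless dirty?   # cheap path: reuse the cached string
    "{ #{@name}: [#{@children.map(&:encode).join(', ')}] }"
  end
end

leaf = Node.new("A", "{ A: [] }")
root = Node.new("B", "{ B: [{ A: [] }] }", [leaf])
root.encode          # => "{ B: [{ A: [] }] }" (cached, nothing re-encoded)
leaf.modify!("A2")
root.encode          # => "{ B: [{ A2: [] }] }" (re-encoded: leaf was dirty)
```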
<p>This optimization alone reduced the endpoint’s execution time by 30 percent. Here’s the execution time chart right after deploying the optimization:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0073-result-optim-1.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0073-result-optim-1.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0073-result-optim-1.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<h2 id="second-optimization-finding-a-node">Second Optimization: Finding a Node</h2>
<p>The first optimization worked on the <code>#encode</code> method, so the natural next step was to look at the other time-consuming method, <code>#find_all</code>.</p>
<p>As we briefly mentioned, <code>#find_all</code> is responsible for two things: Decoding the Protobuf string into a Ruby object and modifying the object itself.</p>
<p>Unfortunately, there is no way of knowing beforehand if we’ll need to modify anything or not, so we’ll always have to do the decoding step. But what about the other thing <code>#find_all</code> does, modifying the object?</p>
<p>Before diving in, let’s recall a few things:</p>
<ol>
<li>Protobuf is a tree-based data structure;</li>
<li>The trees we receive have no internal order to take advantage of;</li>
<li>Our algorithm searches for specific nodes and removes them from the tree;</li>
<li>We don’t know what the trees look like beforehand.</li>
</ol>
<p>Before this optimization, <code>#find_all</code> was running a simple tree traversal to try and find those specific nodes mentioned in step three above. This is an acceptable approach when your input is small or when you’re not too worried about response time, but when you have massive inputs and want to deliver the smallest possible runtime, tree traversals can be a problem: They have linear time complexity (<code>O(n)</code>, where n is the number of nodes).</p>
<p>Once we know the path to a node, though, accessing it is very cheap: It can be done in logarithmic time, <code>O(log n)</code>. This is possible because of a mathematical property of trees: Tree height is roughly a logarithmic function of the number of nodes (it might degenerate into a linear function as well, but let’s leave those explanations to the textbooks), so the average-case maximum path length to a node (that is, from the root down to the deepest leaf) is also bound by that same logarithmic constraint.</p>
<p>So, we started looking closely at which paths led to those few nodes we wanted to remove. Ideally, there would be a single, universal path found in all the trees we ever encounter. That way, we could store that single path and always be guaranteed to find the nodes we wanted. Conversely, the worst possible outcome would be that every tree had a unique path to those nodes.</p>
<p>The truth lay somewhere between those two extremes (thankfully for us, it leaned more towards the former than the latter). Here’s a chart of the number of different paths we found over time (the two curves represent paths to two of the nodes we want to remove):</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0070-paths-log.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0070-paths-log.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0070-paths-log.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>Without going into too much detail about that chart, just notice that it is <em>very logarithmic!</em> This is excellent for us, because it means that with a relatively small number of paths we can find a very large percentage of the nodes we want to find (and for the few we don’t, not finding them is okay). The next chart compares what we actually found (logarithmically growing paths) with the worst possible scenario mentioned previously:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0070-worst-case.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0070-worst-case.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0070-worst-case.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>So, the second optimization was, in the end, also very simple: We collected a large enough number of different paths and then followed them directly, in logarithmic time, instead of running the full linear-time tree traversal to find the nodes we wanted to modify. This was responsible for a 300 percent speedup:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0075-result-optim-2.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0075-result-optim-2.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0075-result-optim-2.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
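<p>The difference between the two lookup strategies can be sketched with a toy model of nested hashes (not the real Protobuf objects): <code>find_by_traversal</code> visits every node until it finds the target, while <code>find_by_path</code> follows a previously stored path straight down.</p>

```ruby
# Full traversal: visit every node until one contains the target key. O(n).
def find_by_traversal(node, target_key)
  return node[target_key] if node.key?(target_key)
  node.each_value do |child|
    next unless child.is_a?(Hash)
    found = find_by_traversal(child, target_key)
    return found if found
  end
  nil
end

# Path lookup: follow a recorded path straight to the node. Cost is
# proportional to the path length, i.e. the tree height.
def find_by_path(node, path)
  path.reduce(node) { |current, key| current && current[key] }
end

tree = { "C" => { "B" => { "A" => 1 } }, "F" => { "D" => 2, "X1" => 3 } }

find_by_traversal(tree, "D")    # => 2, after searching the whole tree
find_by_path(tree, %w[F D])     # => 2, jumping straight there
```

<p>When a stored path doesn’t exist in a given tree, the lookup simply returns <code>nil</code> – which matches the trade-off described below: a miss is acceptable.</p>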
<h2 id="caveats">Caveats</h2>
<p>You might have noticed that this method is not perfect, in the sense that it doesn’t always find all the nodes that a complete tree traversal would have found. This is quite true: The optimization comes with an accuracy trade-off. While this might be a deal-breaker for systems that need 100 percent accuracy at all times, it wasn’t really a problem for us: missing a few nodes out of the several thousand we process each hour wasn’t a big deal.</p>
<p>As time passes, however, and different trees keep coming in, the precision of this approach eventually declines to a level that is significant even for our not-too-strict requirements. This happens slowly, since new paths appear at a roughly logarithmic rate, but surely: because we had stored a fixed number of paths, our accuracy could only keep descending.</p>
<p>After applying our optimized algorithm to the payload and responding to the request, we post-process a subset of the requests in the background and dynamically update the path definitions. This way, we always have a very high success rate on the parsing but keep the latency of responding to a particular request low.</p>
<h2 id="final-results">Final Results</h2>
<p>Here’s a chart showing our execution time before and after we rolled out both optimizations:</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/2019/0076-final-results.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/2019/0076-final-results.png" alt="">
<noscript>
<img src="/blog/assets/images/2019/0076-final-results.png" alt="">
</noscript>
</a>
</div>
<div class="image-caption"></div>
</div>
<p>We effectively reduced execution time from 300-500ms to 80ms, with an accuracy trade-off small enough to go unnoticed by users.</p>
<h2 id="notes">Notes</h2>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Protobuf, or <a href="https://en.wikipedia.org/wiki/Protocol_Buffers">Protocol buffers</a>, is a binary serialization method developed by Google. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
<h1>My attempt at creating more</h1>
<p><em>2019-08-30</em></p>
<p>I began blogging in the now prehistoric late 2000s.</p>
<p>I’ve done a few blogs about different subjects (computer science, algorithms, web development, short stories and political ramblings). I’ve had blogs on Blogspot, Wordpress and, more recently, Medium.</p>
<p>Those platforms were (or are, I suppose) an easy way to spew your ideas over the Internet while also being nice and comfy for other people to actually read (this last point is important for the CSS-challenged such as yours truly). In other words, those services Got Shit Done™.</p>
<!-- more -->
<p>Alas, as I opened my eyes to the wonders of web development I started noticing a few things. First, Wordpress is written in PHP, which is gross (just kidding). Second, you don’t really control much: you can pick themes or whatever, but you won’t have the full control you’d have by creating a website from scratch or nearly scratch. Third, and maybe a corollary to the previous point, that stuff is <em>bloated</em>. There’s approximately 3 terabytes of mostly useless JavaScript, ads and all kind of crap I don’t care about.</p>
<p>But most importantly, I understood the hidden costs of most “free” web services. You don’t really own anything. You provide content, and Wordpress or Google or whoever packages that content into a neat bundle and serves it to your audience together with whatever else (<em>cough, trackers</em>) they see fit.</p>
<p>That’s one of the reasons that pushed me towards a less walled-garden approach to blogging. But there’s another reason as well.</p>
<p>Like <a href="https://code.divshot.com/geo-bootstrap/">many others</a>, I have fond memories of the late-90s/early-2000s Web 1.0 Internet. There is something warm and fuzzy about those beautifully terrible Geocities pages. They pierced the eyes of the viewer but were wondrous in a way. As I said, I’m not alone: Web 1.0 nostalgia is definitely <a href="https://gizmodo.com/the-great-web-1-0-revival-1651487835">on the rise</a>.</p>
<p>But why? What is not to like in our world of beautiful walled gardens? Surely it is better than those gross-looking Web 1.0 fan sites about some crappy GameBoy game, right? …Right?</p>
<div class="image-box stretch">
<div>
<a href="/blog/assets/images/fan_page_screenshot.png" target="_blank">
<img class="lazy " data-src="/blog/assets/images/fan_page_screenshot.png" alt="View of an old website about Pokemon">
<noscript>
<img src="/blog/assets/images/fan_page_screenshot.png" alt="View of an old website about Pokemon">
</noscript>
</a>
</div>
<div class="image-caption">We're soooo cooler than this in 2019.</div>
</div>
<p><em>Wrong.</em></p>
<p>Well, in many ways the Internet has of course improved over time. It has useful things like search engines and Wikipedia, and convenient subscription-based entertainment like Netflix. It has a whole bunch of nice stuff I could spend hours blabbering about.</p>
<p>But it also has a lot of problems. At this point there are surely many PhD theses about most of them, so I won’t bother. I’m just going to recommend one <a href="https://www.nytimes.com/2019/08/11/world/americas/youtube-brazil.html">New York Times article</a> that explains how YouTube indirectly helped elect a buffoon that <a href="https://extra.globo.com/noticias/brasil/bolsonaro-faz-piada-com-oriental-tudo-pequenininho-ai-veja-video-rv1-1-23668287.html">makes high school-tier penis jokes</a> as president of Brazil.</p>
<p>Now, the specific problem with today’s Internet that I feel is most relevant regarding blogging is how we’re gravitating towards all these “free” services all the time. Medium, for instance, is so <em>nice looking</em> that one doesn’t even think of perhaps using something else. But what happens when <em>everyone</em> uses Medium? First: all the blogs look exactly the same, which is lame. Second, Medium gets all that content and traffic for itself, for free.</p>
<p>Of course, not everyone is skilled enough to build a personal blog from scratch. I am just barely able, as you can see from my lackluster front-end skills (I promise you I’m good at back-end things). So I’m definitely not dismissing the inclusiveness that services like Medium offer.</p>
<p>But as I searched for a way out of the walled gardens and fiddled with <a href="https://jekyllrb.com/">Jekyll</a> for a while, I figured I might as well just <a href="https://tjcx.me/posts/consumption-distraction/">build something to call my own</a>.</p>
<h2 id="so-i-built-this">So I built this.</h2>
<p>The goals were to create the simplest possible blogging system with as little fluff as possible. It should meet what I defined as basic blogging needs: list posts, show post, use tags, use images etc. And also not have 3 terabytes of JavaScript split in 90 requests just to show a fancy menu button.</p>
<p>So I started messing with Nanoc, an excellent Ruby library for static page generation, and came up with <a href="https://github.com/lbrito1/sane-blog-builder">this bad boy</a>.</p>
<p>I won’t pretend these ideas are new. They aren’t! I feel there’s been an increasing amount of <a href="https://code.divshot.com/geo-bootstrap/">Web 1.0 nostalgia</a> going on, and a big part of that is probably fueled by <a href="https://tjcx.me/posts/consumption-distraction/">similar sentiments</a> as those I described. The <a href="https://thebestmotherfucking.website/">longing for simplicity</a> in a world of trillions of new JS frameworks is also quite widespread these days.</p>
<p>This small project is nothing special. There are <a href="https://github.com/remko/blog-skeleton">much better projects</a> <a href="https://clarkdave.net/2012/02/building-a-static-blog-with-nanoc/">available for free</a> on the Internet done by people that actually know what they’re doing with a CSS file. This here is just a tiny vase with some ugly flowers – it would be ridiculous to compare it to the beautiful walled gardens of Medium or Wordpress. But, ugly as they are, they’re <strong>mine</strong>!</p>
<h1>Halving page sizes with srcset</h1>
<p><em>2018-09-03</em></p>
<p><a href="https://www.webbloatscore.com/">Web bloat</a> is <a href="http://idlewords.com/talks/website_obesity.htm">discussed</a> a lot nowadays. Web pages with fairly straightforward content — such as a Google search results page — are substantially bigger today than they were a few decades ago, even though the content itself hasn’t changed that much. We, web developers, are at least partly to blame: laziness or just <a href="http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/">bad programming</a> are definitely part of the problem (of course, laziness might stem from a tight or impossible deadline, and bad code might come from inexperienced programmers — no judgment going on here).</p>
<!-- more -->
<p>But here at Guava we believe that software should not be unnecessarily bloated, even though it could be slightly easier to develop and ship. We believe in delivering high quality production code, and a part of that is not taking the easy way out in detriment of page size.</p>
<p>We frequently have to start working on long-running software that has more than a few coding shortcuts that were probably necessary at the time to ship something quickly to production, but are now aching for optimization. Sometimes the improvements are too time-consuming to be worth our trouble, but sometimes they are an extremely easy win.</p>
<p>Such is the case of separating image assets by pixel density (DPI). As the name implies, DPI (dots per inch) is the number of dots (or pixels, in our case) that fit in an inch of screen real estate. The exact definition varies according to context, so for the sake of readability we’ll say that low DPI means the average desktop or laptop screen and budget smartphones, while high DPI means the average smartphone, tablet or higher-resolution computer screens (e.g. Retina displays and 4k monitors).</p>
<p>Nowadays, smartphone customers are important to most online retail businesses, which means that we should serve high DPI images <em>when necessary</em>. The “when necessary” part is important because the easy way out is to <em>always</em> serve high DPI assets, even though the client device might not need them. The problem with this is that high DPI images are roughly 4 times as big as their low DPI counterparts, so low DPI devices would be getting unnecessarily big images for nothing at all — web bloat!</p>
<p>Serving different assets according to the client’s DPI was not a trivial task a few years ago, which means that the web is probably filled with pages that still serve high DPI assets by default to all client browsers. But now that HTML5 is widely adopted we can make good use of <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img">srcset</a> to do just that. To each their own: <code>srcset</code> takes a list of different images and serves the most appropriate one to each client. In image-heavy sites such as retail stores this is an excellent tool to optimize average page size and save a good deal of bandwidth — which means saving money. Smaller images also take less time to load, so customers will also see product images faster than before.</p>
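<p>For illustration, a product image served at two densities looks something like this (file names hypothetical):</p>

```html
<!-- The browser downloads product@2x.jpg only on high-DPI screens;
     low-DPI devices fetch the smaller product.jpg and nothing else. -->
<img src="product.jpg"
     srcset="product.jpg 1x, product@2x.jpg 2x"
     alt="Product photo">
```

<p>The plain <code>src</code> attribute doubles as the fallback for older browsers that don’t understand <code>srcset</code>.</p>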
<p>This very simple change allowed us to decrease page sizes in one of our projects by over 50% in some of its most-accessed endpoints, with an overall average page size reduction of 25% for low DPI customers. Considering that some of the pages were 4 or 5MB big, halving those sizes was a great improvement for our customers — even more so considering that some of them might access our site on low-quality mobile networks, which can be excruciatingly slow. Considering the proportion of low DPI customers we have on an average day, this improvement saved our client some 7.5% of bandwidth.</p>
<p>Now that we’ve got some hindsight, it seems glaringly obvious that we should have been using this feature all along. But more often than not, extremely simple optimizations such as the one we described are overlooked by less experienced teams or worse — deemed “not important” by management because customers nowadays supposedly can spare a few megabytes per page (that may be so, but they don’t want to!).</p>
<p>We think that bloated web pages hurt everyone involved: web developers, customers and businesses. We strive to achieve what we think is good quality web code: that which delivers optimized, slim web pages to all clients.</p>
<p>By <a href="https://medium.com/@lbrito">Leonardo Brito</a> on <a href="https://medium.com/p/f82a1c5deb26">January 14, 2019</a>.</p>
<p><a href="https://medium.com/@lbrito/halving-page-sizes-with-srcset-f82a1c5deb26">Canonical link</a></p>
<p>Exported from <a href="https://medium.com">Medium</a> on May 1, 2019.</p>
<h1>10 ways not to do a big deploy</h1>
<p><em>2018-09-03</em></p>
<p>Ideally, deploys should be small, concise, easily revertible, fast and with a small or nil footprint on the database. However, no matter how awesome you are, sometimes that is just unattainable and you end up needing to deploy something that is just the opposite: big, messy, hard to revert, painfully slow and rubbing the DB the wrong way. If the deploy messes with a mission-critical part of your software, all the worse for you.</p>
<p>But there are actually many ways you can make those situations even worse. Here are a few bullet points you can follow to guarantee a nightmarish deploy complete with nasty side-effects that will haunt you and your coworkers for days to come.</p>
<!-- more -->
<h2 id="dont-make-aplan">1. Don’t make a plan</h2>
<p>Plans suck. They take time and effort, and don’t add any new features to your software. Planning a deploy requires thinking carefully about what it should do and, more importantly, what it shouldn’t do (but potentially could). A good deploy plan is a step-by-step happy path that is written clearly and concisely, followed by a list of everything nasty that can happen. Making a deploy plan is basically trying to cover as many blind spots as you can before pulling the trigger. But, of course, you and your team are code ninjas or master software crafters or whatever the hippest term is nowadays, and you don’t need a plan! Just wing it. Press the button and solve every problem that might arise in an ad-hoc fashion. What could go wrong?</p>
<h2 id="dont-scheduledowntime">2. Don’t schedule downtime</h2>
<p>Downtime sucks: it usually happens at odd hours, late at night or early in the morning, when customers are fast asleep (and you would very much like to be as well). Why bother blocking public access and redirecting customers to a nice “scheduled maintenance page”? Why gift you and your team with peace of mind and a clear timeframe to work with when you can feel the rush of breaking stuff in production with live customers? Production debugging is the best kind of debugging! Confuse your customers with inconsistent states and leave them waiting while your team tries to fix those bugs that were definitely fixed last Friday night.</p>
<h2 id="dont-have-a-great-logsystem">3. Don’t have a great log system</h2>
<p>Logs are for buggy software, you won’t need them. Why spend time and possibly money on a great logging-as-a-service (LaaS) platform? Just have your whole team <code>ssh</code> into production and watch the log tails. Or, even better, use a terrible LaaS that is slow, unreliable and has a confusing user interface, so everyone can get frustrated trying to find errors during the deploy.</p>
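<p>Taken seriously for a moment, the artisanal “log system” this advice recommends looks something like the sketch below — every teammate hand-grepping a raw log file instead of querying a shared platform. The file path and log lines are made up for illustration.</p>

```shell
# The hand-rolled "log system": grep a raw production log for errors.
# In real life this would be `tail -f` over ssh, one terminal per teammate.
cat > /tmp/production.log <<'EOF'
2018-09-03T02:14:07Z INFO  deploy started
2018-09-03T02:15:31Z ERROR undefined method for nil
2018-09-03T02:16:02Z INFO  worker restarted
EOF

# Count error lines since the deploy began.
grep -c 'ERROR' /tmp/production.log
```

<p>It works — right up until you have a dozen app servers and need to correlate errors across all of them at 3 a.m.</p>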
<h2 id="dont-have-a-bugtracker">4. Don’t have a bug tracker</h2>
<p>See above: just like logs, bug trackers are also lame. Your awesome PR won’t have any bugs, now, will it? Regressions never happen under your watch. Also, who needs to track exceptions with a great, fast, reliable bug tracking platform when you have logs available? Aren’t you hacker enough to <code>grep</code> every single exception that might be raised?</p>
<h2 id="dont-have-a-stagingserver">5. Don’t have a staging server</h2>
<p>Staging servers are a waste of resources, both time and money. What is the point of having a close-to-exact copy of your production servers, which by this point are radically different from your development environment? Sure, containerization already <em>kind of</em> abstracts many of those differences, but (hopefully) you have network settings, 3rd-party APIs and other stuff that aren’t the same in development, even with containers. So be bold and make the leap from development right to production!</p>
<h2 id="dont-check-your-envvars">6. Don’t check your env vars</h2>
<p>Your project only has like 80 different access tokens, API keys, DB credentials and cache store credentials spread over half a dozen YAMLs. Super easy to keep track of, and super hard to mess up your production, development and (hopefully) staging environments. Don’t triple-check the variables that might have been changed in the deploy, and you’ll secure a few hours of painful debugging in the near future.</p>
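<p>If you did want to triple-check, a pre-deploy guard can be as small as the sketch below. The variable names are hypothetical examples; a real project would list its actual required config.</p>

```shell
# Fail-fast check that required env vars are set before deploying.
# The variable names here are hypothetical examples.
required_vars="DATABASE_URL REDIS_URL SECRET_KEY_BASE"
missing=""
for var in $required_vars; do
  # Indirect expansion: read the value of the variable named in $var.
  eval "value=\${$var:-}"
  if [ -z "$value" ]; then
    missing="$missing $var"
  fi
done
if [ -n "$missing" ]; then
  echo "Refusing to deploy; unset vars:$missing"
else
  echo "All required env vars present."
fi
```

<p>Wire something like this into the first step of the deploy script and the “wrong API key in production” class of bug dies before it ships.</p>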
<h2 id="dont-guarantee-data-consistency-post-deploy">7. Don’t guarantee data consistency post-deploy</h2>
<p>In a previous step you were already told to make sure that customers can keep using your software mid-deploy, so we’re already halfway to guaranteeing poor data consistency. Make sure you haven’t mapped out all the points where your new code might touch the DB, particularly the DB structure itself. If anything goes wrong, just revert the commit and roll back — don’t ever worry about data becoming orphaned or inconsistent.</p>
<h2 id="dont-prepare-for-a-laterollback">8. Don’t prepare for a late rollback</h2>
<p>If everything else fails… wait, it won’t! Some problems can surface during the deploy, sure, but we won’t need to roll back <em>after</em> it is done, right? Right? After everything is settled, and you made a plan (which you totally shouldn’t, remember?) and followed it step-by-step, and all went well, you shouldn’t need to roll back. But let’s say it happens, and a few hours (or days) after the deploy you need to go back to the previous commit/tag/whatever you use. New data will have flowed in, and it might need to be manually converted back into something manageable by the previous version of your software. Don’t think about it, don’t plan for it — it isn’t likely to happen. And if it does, you will have a heck of a time working on oddball edge cases late in the night. What is not to love?</p>
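<p>One cheap hedge, assuming a git-based deploy: tag the exact revision you are about to replace before shipping, so a late rollback has an unambiguous target. The sketch below creates a throwaway demo repo so it is self-contained; on a real deploy you would run only the tag command (ideally alongside a DB backup).</p>

```shell
# Demo repo in a temp dir; a real deploy would skip straight to the tag.
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git -c user.email=deploy@example.com -c user.name=deploy \
    commit -q --allow-empty -m "current release"

# Annotated tag marking the state being replaced.
git -c user.email=deploy@example.com -c user.name=deploy \
    tag -a "pre-deploy-demo" -m "state before the big deploy"

git tag -l 'pre-deploy-*'
```

<p>Tagging doesn’t solve the new-data problem described above — only a rollback plan does — but it removes the late-night archaeology of figuring out which commit was actually live.</p>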
<h2 id="dont-communicate-efficiently-with-yourteam">9. Don’t communicate efficiently with your team</h2>
<p>You already know you should have terrible log and error tracking systems. Add insult to injury and don’t talk to your coworkers in a quick, direct and clear way. Long pauses are great for dramatic effect, especially when your coworkers are waiting for a timely answer. Be vague about what you’re doing. Hit the rollback button and “forget” to tell people about it. In general, just be as confusing and unavailable as possible.</p>
<p>Following all of the points above might lead to a “perfect storm” situation, and making sure you don’t follow them will surely make things easier on you and your team. But even if you have great deploy practices in place, sometimes things just fall apart. There will always be blind spots, and it is in their nature to be more or less unpredictable. That is just the way things are with software development. Which leads us to our 10th and final point in this guide to terrible deploys:</p>
<h2 id="dont-be-patient-and-understanding-with-your-coworkers-if-everything-fallsapart">10. Don’t be patient and understanding with your coworkers if everything falls apart!</h2>
<p>By <a href="https://medium.com/@lbrito">Leonardo Brito</a> on <a href="https://medium.com/p/f536d1ad9a5a">September 3, 2018</a>.</p>
<p><a href="https://medium.com/@lbrito/10-ways-not-to-do-a-big-deploy-f536d1ad9a5a">Canonical link</a></p>
<p>Exported from <a href="https://medium.com">Medium</a> on May 1, 2019.</p>
Ideally, deploys should be small, concise and easily revertible. However, sometimes everything just goes kaput. Let's take a look at just how miserable a deploy can get.

Working remotely in a non-remote company

<div class="image-box stretch">
<div>
<a href="/blog/assets/images/goiabada/1*mgVZOuAHmp9Ipm2asL0IQQ.jpg" target="_blank">
<img class="lazy " data-src="/blog/assets/images/goiabada/1*mgVZOuAHmp9Ipm2asL0IQQ.jpg" alt="">
<noscript>
<img src="/blog/assets/images/goiabada/1*mgVZOuAHmp9Ipm2asL0IQQ.jpg" alt="">
</noscript>
</a>
</div>
</div>
<p>We’re a small team here at Guava, and we’ve always considered ourselves <em>remote friendly.</em> Most of us work remotely every now and then, pushed by varied <em>force majeure</em> situations: be it the flu, the need to supervise renovation or construction work at home, flash floods near the office, or receiving guests at home. We’ve also had a few of us working remotely for a few days or weeks while traveling to or back from a conference, or when visiting relatives who live out of town. In other words, remote working has always been a very temporary and circumstantial thing among us.</p>
<p>We have a nice office (with hammocks!), excellent work equipment, great desk space, comfortable chairs, plenty of snacks and comfort food and an infinite supply of coffee. We’re also easygoing and overall pleasant people (well, most of us are) to work with several hours a day, and some of us are even mildly funny.</p>
<!-- more -->
<p>I bid adieu to my coworkers, the coffee machine, the nice desk and the hammocks and traveled abroad to try out being a remote worker (some prefer the term <em>digital nomad</em> — to me, it seems a bit preposterous to compare month-long stays in modern urban dwellings with electricity and wireless internet to the traditional nomadic lifestyle) for half a year. A few weeks before leaving, I read the interesting <a href="https://basecamp.com/books/remote">Remote: Office Not Required</a>, which I highly recommend to anyone considering working remotely. Some of the challenges I faced during my time as a remote worker were foretold by the book, while others were a complete surprise. Here are a few of the things I learned firsthand about remote work:</p>
<h2 id="it-takes-time-toadjust">It takes time to adjust.</h2>
<p>Your mind takes some time to adjust to working remotely. In many ways, working remotely feels like a completely new job — even if you’ve been in the same company and position for years.</p>
<p>Some people have more trouble with this than others, but everyone will take some time to adjust. The important lesson here — for worker <em>and employer</em> — is to have patience. Steep as it might be, the learning curve of adapting to remoteness will eventually plateau.</p>
<h2 id="it-is-easier-when-youre-well-acquainted-with-yourteam">It is easier when you’re well acquainted with your team.</h2>
<p>Starting a new project — be it a new job or just a new assignment involving different team members — may be daunting. Starting remote work already with good rapport with your coworkers helps tremendously, as you will feel more at ease to talk to people.</p>
<p>It is important to feel comfortable enough to let your team know right away if something is going wrong, for example, as opposed to keeping it to yourself and suffering silently. Good rapport between developers also means it will be easier to understand each other when discussing technical problems.</p>
<h2 id="it-is-easier-when-youre-not-the-only-remoteperson">It is easier when you’re not the only remote person.</h2>
<p>Being “the remote guy” is a thing. People tend to forget those they don’t see every day, and you have to be comfortable with the low profile that comes with working remotely.</p>
<p>Having other remote workers on your team helps a bit, creating that nice “we’re all in the same boat” feeling.</p>
<h2 id="you-need-to-be-able-to-communicate-verywell">You need to be able to communicate very well.</h2>
<p>A huge part of working as a developer is being able to communicate well with other developers and with normal human beings. A programming genius who isn’t able to explain what they’re doing to their non-genius coworkers will likely not be a very good developer overall.</p>
<p>Language is one of our great achievements as a species, and it carries the weight and complexity of thousands of years of mutation. Expressing yourself verbally is hard enough, but we also rely unconsciously on a myriad of non-verbal cues to communicate with one another — which you won’t have as a remote worker (at least most of the time).</p>
<p>Sure, you can occasionally call the HQ and video-chat with someone. But that is just not practical most of the time. As a remote worker, I find myself relying heavily on written communication with the rest of the team.</p>
<p>Every challenge brings a chance to learn something. Because of the challenges and limitations of working remotely, the experience helps you grow professionally in some ways:</p>
<ul>
<li>Guidance from more experienced coworkers or bosses is much more rarefied, which forces you to exercise self-teaching and proactivity.</li>
<li>The need to communicate more often through asynchronous text-first chats helps you develop language skills (and patience).</li>
<li>Working away from the office and your coworkers makes you appreciate them more when eventually returning to the HQ.</li>
</ul>
<p>Of course, there are many other benefits that come to mind when thinking about remote work, such <a href="https://www.thriveglobal.com/stories/30386-a-2-year-stanford-study-shows-the-astonishing-productivity-boost-of-working-from-home">as increased productivity and financial savings</a>, and there are already studies and books that got those covered. Which is not to say that remote work is some kind of panacea: it has fundamental disadvantages when compared with traditional work inside a brick-and-mortar office building, the most obvious and important being the lack of human contact with your fellow workers. The solitude and heavy reliance on written, asynchronous communication that often comes with remote work might not be your cup of tea.</p>
<p>Remote work is neither a universal solution nor something completely out of reach for the average developer. Although it won’t be to everyone’s taste, it is definitely available to everyone (or should be). This semester abroad taught me that it is plainly possible and viable for a developer in a small, non-remote (but remote-friendly) software company to work far away from the HQ, and both sides have a lot to gain from the experience. It is really a win-win scenario, and people should try it more often.</p>
<p>By <a href="https://medium.com/@lbrito">Leonardo Brito</a> on <a href="https://medium.com/p/ce9e39645f85">June 12, 2018</a>.</p>
<p><a href="https://medium.com/@lbrito/working-remotely-in-a-non-remote-company-ce9e39645f85">Canonical link</a></p>
<p>Exported from <a href="https://medium.com">Medium</a> on May 1, 2019.</p>
Here's what I learned after working half a year remotely in a remote-friendly (but mostly non-remote) company.