<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Cloud on Clément Joly – Open-Source, Rust &amp; SQLite</title><link>https://cj.rs/tags/cloud/</link><description>Recent content in Cloud on Clément Joly – Open-Source, Rust &amp; SQLite</description><generator>Hugo -- 0.154.3</generator><language>en</language><copyright>Clément Joly</copyright><lastBuildDate>Mon, 09 Mar 2026 22:59:21 +0000</lastBuildDate><atom:link href="https://cj.rs/tags/cloud/index.xml" rel="self" type="application/rss+xml"/><item><title>Link Aggregator Infrastructure</title><link>https://cj.rs/blog/link-aggregator-infrastructure/</link><pubDate>Thu, 21 Mar 2024 14:40:33 +0000</pubDate><guid>https://cj.rs/blog/link-aggregator-infrastructure/</guid><description>A look at the surprisingly simple infrastructure of link aggregators like HackerNews or Lobste.rs.</description><content:encoded><![CDATA[



  
  
  
  

  <div class="alert alert-warning">
    <p class="alert-heading">
      ⚠️
      
        DISCLAIMER
      
    </p>
    <p>This is not a comment on Reddit’s future profitability or any reflection on the stock performance.
This is a technical discussion of how a relatively simple link aggregator can be hosted, and of the trade-offs involved.</p>
  </div>



<p>Reddit went public today.
Its <a href="https://www.sec.gov/Archives/edgar/data/1713445/000162828024011448/reddit-sx1a2.htm#i1b9a579e78a34dfa99f7f26daeec195b_40">IPO document</a> states<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> that they use AWS and GCP.
But I cannot find a separate breakdown of its infrastructure costs.</p>
<p>That makes me wonder: what do the infrastructure costs of a pure link aggregator look like?</p>
<p>I’ll go over two sites aggregating links related to computing, <a href="https://lobste.rs/">Lobsters</a> and <a href="https://news.ycombinator.com">Hacker News</a>.
They are relatively high-traffic websites.
<a href="https://news.ycombinator.com/item?id=39137882">Sites</a> <a href="https://news.ycombinator.com/item?id=39192941">popular</a> on Hacker News <a href="https://news.ycombinator.com/item?id=39746350">in particular</a> <a href="https://news.ycombinator.com/item?id=39631607">often</a> <a href="https://news.ycombinator.com/item?id=39536126">go down</a> <a href="https://news.ycombinator.com/item?id=39419248">due to</a> <a href="https://news.ycombinator.com/item?id=39224966">the sudden</a> <a href="https://news.ycombinator.com/item?id=39137882">popularity</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.
It’s a <a href="https://lobste.rs/search?q=lobstered&amp;what=comments&amp;order=newest">little less common</a> with Lobsters, but it happens there as well.</p>
<h2 id="lobsters">Lobste.rs</h2>
<p><a href="https://lobste.rs/">Lobsters</a> is very transparent on its infrastructure:</p>




  <figure>
    <blockquote cite="https://lobste.rs/about">
      <p>Lobsters is hosted on three VPSs at DigitalOcean: a s-4vcpu-8gb for the web server, a s-2vcpu-4gb for the mariadb server, and a s-1vcpu-1gb for the IRC bot
[…]  we use restic for backups to b2</p>

    </blockquote>
    
      <figcaption class="blockquote-caption">
        
          <cite style="text-align: right"><a href="https://lobste.rs/about">https://lobste.rs/about</a></cite>
          <br/>
        
        
      </figcaption>
    
  </figure>



<p>Using <a href="https://slugs.do-api.dev/">public prices</a>, those servers cost $78 ($48+$24+$6) a month to run. Obviously, there are other costs for monitoring, backups, managed DNS<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>…</p>
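<p>As a minimal sketch of that arithmetic, the snippet below sums the three monthly prices quoted above. The slug names come from the Lobsters about page; the dictionary and function names are my own, and the prices are a snapshot from the time of writing, so they may have changed since.</p>

```python
# Monthly prices in USD, as quoted above from the public price list.
# These are a snapshot at the time of writing, not live data.
DROPLET_PRICES_USD = {
    "s-4vcpu-8gb": 48,  # web server
    "s-2vcpu-4gb": 24,  # mariadb server
    "s-1vcpu-1gb": 6,   # IRC bot
}


def monthly_total(prices):
    """Sum the monthly price of every droplet."""
    return sum(prices.values())


def yearly_total(prices):
    """Extrapolate the monthly total to twelve months."""
    return 12 * monthly_total(prices)


if __name__ == "__main__":
    print(f"${monthly_total(DROPLET_PRICES_USD)} per month")
    print(f"${yearly_total(DROPLET_PRICES_USD)} per year")
```

<p>This only covers the three VPSs: $78 a month, or $936 a year, before backups, monitoring and DNS.</p>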
<p>I’ve <a href="https://lobste.rs/s/qpwghe">requested some numbers</a> on the load the site is facing and infrastructure utilization.
I’ll update this post with the results.</p>
<h2 id="hacker-news">Hacker News</h2>
<p>It’s harder to find details on <a href="https://news.ycombinator.com">Hacker News’s</a> infrastructure, but the moderator of the site answered questions about the infra in <a href="https://news.ycombinator.com/item?id=28478379">this thread</a><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>:</p>




  <figure>
    <blockquote cite="https://news.ycombinator.com/item?id=16076041">
      <p>We’re currently running two machines (master and standby) at M5 Hosting. All of HN runs on a single box, nothing exotic:</p>
<pre><code>  CPU: Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz (3500.07-MHz K8-class CPU)
  FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 hardware threads
  Mirrored SSDs for data, mirrored magnetic for logs (UFS)
</code></pre>

    </blockquote>
    
      <figcaption class="blockquote-caption">
        
          <cite style="text-align: right"><a href="https://news.ycombinator.com/item?id=16076041">https://news.ycombinator.com/item?id=16076041</a></cite>
          <br/>
        
        
      </figcaption>
    
  </figure>







  <figure>
    <blockquote cite="https://news.ycombinator.com/item?id=28479595">
      <p>Number of daily requests has gone up closer to 6M</p>

    </blockquote>
    
      <figcaption class="blockquote-caption">
        
          <cite style="text-align: right"><a href="https://news.ycombinator.com/item?id=28479595">https://news.ycombinator.com/item?id=28479595</a></cite>
          <br/>
        
        
      </figcaption>
    
  </figure>







  <figure>
    <blockquote cite="https://news.ycombinator.com/item?id=28496642">
      <p>We use an Nginx front end for that [caching]. It all runs on the same box though.</p>

    </blockquote>
    
      <figcaption class="blockquote-caption">
        
          <cite style="text-align: right"><a href="https://news.ycombinator.com/item?id=28496642">https://news.ycombinator.com/item?id=28496642</a></cite>
          <br/>
        
        
      </figcaption>
    
  </figure>



<p>So a <a href="https://ark.intel.com/content/www/us/en/ark/products/92983/intel-xeon-processor-e5-2637-v4-15m-cache-3-50-ghz.html">CPU from 2016</a> handles the load for a very popular site.</p>
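<p>To put those figures in perspective, here is a back-of-the-envelope sketch. It assumes the roughly 6M daily requests are spread evenly over the day, which real traffic is not, so peaks will be well above this average. It also assumes the hardware from the first quote is still what serves that load. The constant and function names are mine.</p>

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the quotes above.
DAILY_REQUESTS = 6_000_000
HARDWARE_THREADS = 2 * 4 * 2  # packages x cores x threads per core


def average_requests_per_second(daily_requests):
    """Average rate, assuming a perfectly uniform load (a simplification)."""
    return daily_requests / SECONDS_PER_DAY


if __name__ == "__main__":
    rps = average_requests_per_second(DAILY_REQUESTS)
    print(f"about {rps:.0f} requests per second on average")
    print(f"about {rps / HARDWARE_THREADS:.1f} per hardware thread")
```

<p>Under those assumptions the average is about 69 requests per second, or a little over 4 per hardware thread. Part of that is presumably absorbed by the Nginx cache mentioned above.</p>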
<p>No CDN either: requests are made to <code>news.ycombinator.com</code>, with DNS records pointing to M5 Hosting IPs (<code>2606:7100:1:67::26</code> and <code>209.216.230.207</code>)<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>.</p>
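<p>The DNS check described above can be reproduced with a few lines of Python. This is only a sketch using the standard library's <code>socket.getaddrinfo</code>; the function name <code>resolve_addresses</code> is my own, and which address families come back depends on the resolver and the machine's network setup.</p>

```python
import socket


def resolve_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr for both IPv4 and IPv6.
    return sorted({info[4][0] for info in results})
```

<p>Calling <code>resolve_addresses("news.ycombinator.com")</code> needs network access and returns whatever the DNS records say at that moment. The resulting addresses can then be looked up on a service such as ipinfo to see who owns them.</p>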
<h2 id="conclusions">Conclusions</h2>
<p>These two popular link aggregators let users submit and comment on links, two historical features of Reddit.
Of course, Reddit now also has chat, image hosting, and presumably a bunch of other features.
It also has <a href="https://www.sec.gov/Archives/edgar/data/1713445/000162828024011448/reddit-sx1a2.htm#i1b9a579e78a34dfa99f7f26daeec195b_40">“73.1 million daily active uniques (“DAUq”), around the world”</a>, probably orders of magnitude more than Hacker News.
Finally, Reddit might<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> theoretically have higher uptime, because <a href="https://news.ycombinator.com/item?id=35334292">Hacker News</a> and <a href="https://lobste.rs/s/whbyxt/2023_08_30_outage_postmortem">Lobsters</a> sometimes see their single point of failure go down, for instance during an upgrade.</p>
<p>After those incidents, though, users tend to post understanding comments, noting that they first suspected their own Internet connection before considering that the aggregator could be down.
This sort of simple hosting might be the right trade-off for a non-profit link aggregator: the setup is simple, so it is rarely down due to an operational mistake, and its base components are simple enough that users are understanding when an outage does happen.</p>
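<p>To give a sense of scale for that trade-off, the sketch below converts an availability target into a yearly downtime budget. The targets are hypothetical and chosen only for illustration; none of the three sites publishes such a figure as far as I know, and the function name is mine.</p>

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent):
    """Hours of downtime per (non-leap) year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Hypothetical targets, chosen only for illustration.
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows about {hours:.1f} hours down per year")
```

<p>A 99% target, for example, leaves about 87.6 hours of downtime a year, which is a lot of room for the occasional failed upgrade on a single box.</p>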
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>“Currently our cloud service infrastructure is run on our cloud services providers (“CSPs”), which are currently Amazon Web Services and Google Cloud Platform”.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>And those links are only from what I could <a href="https://hn.algolia.com/?dateRange=pastYear&amp;page=0&amp;prefix=true&amp;query=hug%20of%20death&amp;sort=byDate&amp;type=comment">quickly find</a> in comments from the last two months.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>They don’t mention using a CDN on Lobsters.
This is consistent with what I observe when loading the homepage, where all requests are made to the host <code>lobste.rs</code>. A quick DNS lookup returns <code>67.205.189.7</code> and <code>2604:a880:400:d0::2082:1001</code>, and both IPs are owned by DigitalOcean<sup id="fnref1:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>These details are consistent with this <a href="https://news.ycombinator.com/item?id=35334292">comment</a> from 2023.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>According to <a href="https://ipinfo.io/">ipinfo</a>.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>It’s hard to find reliable uptime data for the three sites and compare them.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item></channel></rss>