I would like to see visitor numbers at times of day and probably a few more things in the future. In the past I would have stuck Google Analytics on and got on with my day. Now I want to avoid the cookie banner, and the shitty tracking of my users. I don't want to code analytics, I want to work on features. It feels like my nginx logs have the data I need. Is there a simple tool that analyses this? Or should I just find a less crappy front-end analytics service?
What is everyone using in 2023?
The site is at Bonsai-garden.com just so you can imagine what I am going to need.
GoAccess, AWStats, or other log analyzers can get you a lot of the same data, but they also have even more trouble identifying bots and have little to no ability for customization. Also, if you use client-side-only JavaScript functionality on your site, there's no ability to track that. So if you wanted to track how many people zoom in on cool bonsai tree pics, you wouldn't be able to do that with a log analyzer. Those also don't work well with CDNs.
There are other, more sophisticated tools like Matomo and Piwik Pro that are similar to GA3 in functionality but have the ability to work without cookies if that's what you want. Looks like you don't need something that involved. I'd probably go with either Plausible or possibly Cloudflare if you're looking for something free. Looks like you're already using Cloudflare for some CDN assets.
I've written a book on this subject that covers 15 different options: https://www.quantable.com/analytics/google-analytics-alterna...
FeatureBase is a super fast, highly efficient, in-memory analytical data store which may be queried using SQL. We have customers handling 10s of billions of events with it. It's free, Open Source and available here: https://featurebase.com/. The technology is based on Roaring Bitmaps: https://roaringbitmap.org/
There are reference Docker containers which may be used for development or reference for building a deployment: https://github.com/FeaturebaseDb/featurebase-examples
We have a Discord here if you'd like to discuss what can be done with the product: https://discord.com/invite/featurefirstai
It's got a generous free tier, but costs a bit if you go over 500k events a month.
It's built on ClickHouse, while Matomo is built on MySQL, which in my opinion is pretty much a nail in the coffin for Matomo's future proofing. Matomo is fine for simple analytics needs, but it sucks for tracking sites with custom events with custom dimensions.
I use Plausible [1] which is basically GA but respects privacy and its code is much smaller (<1KB). I’m happy with it. Doesn’t do more or less than satisfying my need for visitor insights.
I did this 1) because of EU privacy rulings and 2) because Google Analytics is deprecating GA3 (which is website focused) for GA4 (which is app focused). The UX for GA4 SUCKS. It is so hard to find basic info like what pages are people most looking at, a realtime view, what are all the domains referring people to me, etc.
So far I like Plausible better: it's simple and focused on websites, whereas Fathom seems ready to hook into more complex martech that I have no interest in.
That said, Plausible supports Google Analytics already and Fathom doesn't yet. In contrast, Plausible doesn't support two-factor authentication (TFA) and Fathom does.
Plausible also has alerts on spikes and sends summary emails at your desired frequency whereas Fathom doesn't seem to do either.
So I guess the analytics market is still somehow in early days.
Also, Michael Lynch pointed out to me that if you plan to sell a site, buyers expect access to Google Analytics data specifically. Something to keep in mind.
Also, it's good to have server-side analytics since this will uncover at least 100% more legitimate users (on tech sites especially). So I tried out Fastly and Netlify, which show basic analytics but don't give you access to access logs. I ended up just hosting on an OVH VM with 7-day block-device backups in case it goes sideways.
Part of the installation is configuring which log format your servers use. If you use a non-standard format, you can map the related fields.
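To make the field-mapping idea concrete, here is a rough Python sketch (not AWStats' own mechanism) that turns an nginx-style log_format string into a regex with one named group per variable. COMBINED is nginx's predefined "combined" format; format_to_regex and the sample line are made up for illustration.

```python
import re

# nginx's predefined "combined" log_format.
COMBINED = ('$remote_addr - $remote_user [$time_local] "$request" '
            '$status $body_bytes_sent "$http_referer" "$http_user_agent"')


def format_to_regex(log_format: str) -> re.Pattern:
    """Build a regex with one named group per $variable (hypothetical helper)."""
    parts = []
    pos = 0
    for m in re.finditer(r'\$(\w+)', log_format):
        # Literal text between variables is matched verbatim.
        parts.append(re.escape(log_format[pos:m.start()]))
        # Choose what the field may contain from the character that follows it.
        following = log_format[m.end():m.end() + 1]
        if following == '"':
            field = r'[^"]*'      # quoted field: anything up to the closing quote
        elif following == ']':
            field = r'[^\]]*'     # bracketed field: anything up to the bracket
        else:
            field = r'\S+'        # bare field: a run of non-space characters
        parts.append(f'(?P<{m.group(1)}>{field})')
        pos = m.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile('^' + ''.join(parts) + '$')


# Made-up sample line in the combined format.
sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /trees/juniper HTTP/1.1" 200 5123 '
          '"https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"')

fields = format_to_regex(COMBINED).match(sample).groupdict()
print(fields['remote_addr'], fields['status'], fields['request'])
```

If your server writes a custom format, passing that format string instead of COMBINED gives you the same named fields, as long as the variables are separated by spaces, quotes, or brackets.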
If you configure AWStats to perform reverse DNS lookups, I would suggest also installing Unbound DNS specifically on that machine and setting the following in /etc/unbound/unbound.conf to minimize load on your upstream servers (the TTL values could perhaps go even higher):
cache-min-ttl: 86400
cache-max-ttl: 1209600
serve-expired: yes
serve-expired-ttl: 259200
serve-expired-reply-ttl: 30
serve-expired-ttl-reset: yes
val-bogus-ttl: 600
cache-max-negative-ttl: 86400
serve-expired-client-timeout: 1800
The above is specifically for machines dedicated to looking up a very large number of PTR records, not for a home router. Some DNS providers rate limit mass-lookups of IP addresses. PTR records are slow to change, so a high TTL override on a log processing box is generally fine unless you care about short-lived PTR records (AWS, Azure, etc.).
> "I would like to see visitor numbers at times of day and probably a few more things in the future."
GA is a very complex tool that is often complete overkill for small sites. So when you ask for GA alternatives, you will probably hear Matomo as a suggestion, but it is in the same area of complexity as GA and would probably not be a good fit for your project (and it wouldn't solve your cookie problem without proper configuration).
So maybe a better question would be "What analytics tool should I use for project XYZ?"
If you want a minimal approach, take a look at Umami (https://umami.is/), which is about as minimal as it gets (when talking about frontend JS analytics tools).
[0] https://www.cloudflare.com/web-analytics/
[1] https://blog.cloudflare.com/privacy-first-web-analytics/
It is also privacy-friendly and fast (using Clickhouse).
https://github.com/matomo-org/matomo
https://github.com/plausible/analytics
https://github.com/arp242/goatcounter
All 3 of them are open source and you can host them yourself. If you don't want to self-host, all 3 have hosted options for you.
I am currently self hosting Matomo and am happy with it.
Fun timing for you to ask, as I'm just running a poll to see what others are using:
They also have a series of books written on analytics and user behaviors, which you can find by searching for amplitude playbooks.
They do offer a free tier and can ingest from different data sources if you want to provide your own.
(Disclaimer: My wife works there, but my workplace also uses it to help determine UX issues in our product.)
It parses your logs and generates reports in HTML or text form.
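For the original question (visitor numbers by time of day), the core of what such a log parser does fits in a few lines. Here is a minimal Python sketch, assuming the nginx combined log format and treating each distinct client IP within an hour as one visitor. That is only a rough approximation: there is no bot filtering, and people behind a shared IP count once. LINE_RE, visitors_by_hour and the sample lines are hypothetical, not taken from any tool.

```python
import re
from collections import defaultdict
from datetime import datetime

# Client IP and timestamp from a combined-format line (assumed log format).
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "')


def visitors_by_hour(lines):
    """Count distinct client IPs per hour of day (hour as written in the log)."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        ts = datetime.strptime(m['time'], '%d/%b/%Y:%H:%M:%S %z')
        seen[ts.hour].add(m['ip'])
    return {hour: len(ips) for hour, ips in sorted(seen.items())}


# Made-up sample lines; with a real log, pass an open file object instead.
sample_lines = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:58:01 +0000] "GET /trees HTTP/1.1" 200 812 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:13:59:12 +0000] "GET / HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:14:02:40 +0000] "GET /about HTTP/1.1" 200 300 "-" "Mozilla/5.0"',
]

for hour, count in visitors_by_hour(sample_lines).items():
    # Plain-text report: one line per hour with a simple bar.
    print(f'{hour:02d}:00  {"#" * count} {count}')
```

The dedicated tools add the things this leaves out (bot detection, referrers, rotated and compressed logs), which is why I'd still reach for one of them rather than maintain a script like this.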
Their summary basically is: Privacy protection is our business model / Your data is always encrypted / We never, ever, ever store any personal data about your visitors. No cookie banners. / We are an EU-based company with EU-based servers. / You own your data.