The issue with a client app backing up dropbox and onedrive folders on your computer is the files on demand feature, you could sync a 1tb onedrive to your 250gb laptop but it's OK because of smart/selective sync aka files on demand. Then backblaze backup tries to back the folder up and requests a download of every single file and now you have zero bytes free, still no backup and a sick laptop.
You could oauth the backblaze app to access onedrive directly, but if you want to back your onedrive up you need a different product IMO.
That doesn't really make a lot of sense, though. Reading a file that's not actually on disk doesn't download it permanently. If I have zero of 10TB worth of files stored locally on my 1TB device, read them all serially, and measure my disk usage, there's no reason the disk should be full, or at least it should be cache that can be easily freed. The only time this is potentially a problem is if one of the files exceeds the total disk space available.
Hell, if I open a directory of photos and my OS tries to pull exif data for each one, it would be wild if that caused those files to be fully downloaded and consume disk space.
Shoutout to Arq backup which simply gives you an option in backup plans for what to do with cloud only files:
- report an error
- ignore
- materialize
Regardless, if you make it back up software that doesn’t give this level of control to users, and you make a change about which files you’re going to back up, you should probably be a lot more vocal with your users about the change. Vanishingly few people read release notes.
Unless it does something very weird it won't trigger all those files to download at the same time. That shouldn't be a worry.
And, as a separate note, they shouldn't be balking at the amount of data in a virtualized onedrive or dropbox either considering the user could get a many-terabyte hard drive for significantly less money.
> Unless it does something very weird it won't trigger all those files to download at the same time. That shouldn't be a worry.
The moment you call read() (or fopen() or your favorite function), the download will be triggered. It's a hook sitting between you and the file. You can't ignore it.
The only way to bypass it is to remount it over rclone or something and use "ls" and "lsd" functions to query filenames. Otherwise it'll download, and it's how it's expected to work.
Maybe it'll, maybe it won't, but it'll cycle all files in the drive and will stress everything from your cloud provider to Backblaze, incl. everything in between; software and hardware-wise.
That sounds very acceptable to get those files backed up.
It shouldn't stress things to spend a couple weeks relaying a terabyte in small chunks. The most likely strain is on my upload bandwidth and yeah that's the cost of cloud backup, more ISPs need to improve upload.
I mean, cycling a couple of terabytes of data over a 512GB drive is at least full 4 writes, which is too much for that kind of thing.
> more ISPs need to improve upload.
I was yelling the same things to the void for the longest time, then I had a brilliant idea of reading the technical specs of the technology coming to my home.
Lo and behold, the numbers I got were the technical limits of the technology that I had at home (PON for the time being), and going higher would need a very large and expensive rewiring with new hardware and technology.
4 writes out of what, 3000? For something you'll need to do once or twice ever? It's fine. You might not even eat your whole Drive Write Per Day quota for the upload duration, let alone the entire month.
> the technical limits of the technology that I had at home (PON for the time being)
How do you know how often those files need to be backed up without reading them? Timestamps and sizes are not reliable, only content hashes. How do you get a content hash? You read the file.
Depends on your device capacity and how much is in actual use. Wear leveling things also wear things while it moves things around.
> For something you'll need to do once or twice ever?
I don't know you, but my cloud storage is living, and even if it's not living, if the software can't smartly ignore files, it'll pull everything in, compare and pass without uploading, causing churns in every backup cycle.
> Isn't that usually symmetrical? Is yours not?
GPON (Gigabit PON) is asymmetric. Theoretical limits is 2.4Gbps down, 1.2Gbps up. I have 1000Mbit/75Mbit at home.
This is a complexity that makes it harder, but not insurmountable.
It would be reasonable to say that if you run the file sync in a mode that keeps everything locally, then Backblaze should be backing it up. Arguably they should even when not in that mode, but it'll churn files repeatedly as you stream files in and out of local storage with the cloud provider.
> Arguably they should even when not in that mode, but it'll churn files repeatedly as you stream files in and out of local storage with the cloud provider.
When you have a couple terabytes of data in that drive, is it acceptable to cycle all that data and use all that bandwidth and wear down your SSD at the same time?
Also, high number of small files is a problem for these services. I have a large font collection in my cloud account and oh boy, if I want to sync that thing, the whole thing proverbially overheats from all the queries it's sending.
Reading your comments, it sounds like you are arguing it is impossible to backup files in Dropbox in any reasonable way, and therefore nobody should backup their cloud files. I know you haven’t technically said that, but that’s what it sounds like.
I assume you don’t think that, so I’m curious, what would you propose positively?
> I know you haven’t technically said that, but that’s what it sounds like.
Yes, I didn't technically said that.
> It sounds like you are arguing it is impossible to backup files in Dropbox in any reasonable way, and therefore nobody should backup their cloud files.
I don't argue neither, either.
What I said is with "on demand file download", traditional backup software faces a hard problem. However, there are better ways to do that, primary candidate being rclone.
You can register a new application ID for your rclone installation for your Google Drive and Dropbox accounts, and use rclone as a very efficient, rsync-like tool to backup your cloud storage. That's what I do.
I'm currently backing up my cloud storages to a local TrueNAS installation. rclone automatically hash-checks everything and downloads the changed ones. If you can mount Backblaze via FUSE or something similar, you can use rclone as an intelligent MITM agent to smartly pull from cloud and push to Backblaze.
Also, using RESTIC or Borg as a backup container is a good idea since they can deduplicate and/or only store the differences between the snapshots, saving tons of space in the process, plus encrypting things for good measure.
But if the files are only on the remote storage and not local, chances are they haven't been modified recently, so it shouldn't download them fully, just check the metadata cache for size / modification time and let them be if they didn't change.
So, in practice, you shouldn't have to download the whole remote drive when you do an incremental backup.
You can't trust size and modification time all the time, though mdate is a better indicator, it's not foolprooof. The only reliable way will be checksumming.
Interestingly, rclone supports that on many providers, but to be able to backblaze support that, it needs to integrate rclone, connect to the providers via that channel and request checks, which is messy, complicated, and computationally expensive. Even if we consider that you won't be hitting API rate limits on the cloud provider.
I guess the problem with Backblaze's business model with respect to Backblaze Personal is that it is "unlimited". They specifically exclude linux users because, well, we're nerds, r/datahoarders exists, and we have different ideas about what "unlimited" means. [1]
This is another example in disguise of two people disagreeing about what "unlimited" means in the context of backup, even if they do claim to have "no restrictions on file type or size" [2].
Any company that does the "unlimited*" shenanigans are automatically out from any selection process I had going, wherever they use it. It's a clear signal that the marketing/financial teams have taken over the businesses, and they'll be quick to offload you from the platform given the chance, and you'll have no recourse.
Always prefer businesses who are upfront and honest about what they can offer their users, in a sustainable way.
You can only do it during growth phases or if there’s complimentary products with margin. The story I was told about Office 365 was the when they were using spinning disk, exchange was IOPS-bound, so they had lots of high volume, low iops storage to offer for SharePoint. Google has a similar story, although neither are really unlimited, but approaching unlimited with for large customers.
Once growth slows, churn eats much of the organic growth and you need to spend money on marketing.
> And statistically-speaking, is viable as long as a company keeps its users to a normal distribution.
Doing a bait-and-switch on a percentage of your paying customers, no matter how small the percentage is, may be "viable" for the company, but it's a hostile experience for those users, and companies deserve to be called out for it.
I mean, in this universe we live in everything is limited somehow.
I do wish it was a word that had to be completely dropped from marketing/adverting.
For example there is not unlimited storage, hell the visible universe has a storage limit. There is not unlimited upload and download speed, and what if when you start using more space they started exponentially slowing the speed you could access the storage? Unlimited CPU time in processing your request? Unlimited execution slots to process your request? Unlimited queue size when processing your requests.
Hence everything turns into the mess of assumptions.
> I mean, in this universe we live in everything is limited somehow.
Yes, indeed, most relevant in this case probably "time" and "bandwidth", put together, even if you saturate the line for a month, they won't throttle you, so for all intents and purposes, the "data cap" is unlimited (or more precise; there is no data cap).
If they limit the rate of speed it's technically limited which really makes me wonder how they legally can say these things. I guess it means in a lot of cases it's like Comcast where they also limit the data a month perhaps but dang.
It’s not unlimited. The limit might be very high these days, but it’s at most bandwidth times duration. And while that sounds trivial, it does mean they aren’t selling you an infinity of a resource.
> And they have the necessary pipes to serve the rate they sell you 24/7
I doubt they have those pipes, at least if every of their customers (or a sufficiently large amount) would actually make use of that.
Second question would be, how long they would allow you to utilize your broadband 24/7 at max capacity without canceling your subscription. Which leads back to the point the person I replied to was making: If you truly make use of what is promised, they cancel you. Hence it is not a faithful offer in the first place.
> Nobody has turned the moon into a hard drive yet.
Not important here because backblaze only has to match the storage of your single device. Plus some extra versions but one year multiplied by upload speed is also a tractable amount.
Since I know how many of those businesses are run I'll let you in on the very obvious secret: there’s zero chance they have enough uplink to accommodate everyone using 100% of their bandwidth at the same time, and probably much less than that.
Residential network access is oversold as everything else.
The only difference with storage is there’s a theoretical maximum on how much a single person can use.
But you could just as well limit backup upload speed for similar effect. Having something about fair use in ToS is really not that different.
I've been running RPI-based torrent client 24/7 in several countries and never experienced that. Eats a few TBs per month, not the full line, but pretty decent amount. I guess it really depends on the country.
I’ve used Spectrum and their predecessors since the 90s. Never ran into this, although the upstream speeds are ridiculously slow, and they used to force Netflix traffic to an undersized peer circuit.
I'm unsure if you're sarcastic or not, never have I've used a ISP that would throttle you, for any reason, this is unheard of in the countries I've lived, and I'm not sure many people would even subscribe to something like that, that sounds very reverse to how a typical at-home broadband connection works.
Of course, in countries where the internet isn't so developed as in other parts of the world, this might make sense, but modern countries don't tend to do that, at least in my experience.
I think most are familiar with throttling because most (all?) phone plans have some data cap at one point, but I don't think I've heard of any broadband connections here with data caps, that wouldn't make any sense.
It’s funny that the same person asking for linux support would complain about B2 “not being for home users”. I sync my own backups to B2 and would set that up over installing linux any day of the week! It’s extremely easy.
Yea, that's pretty shady. Either don't call your service unlimited or bump up the prices so you can survive occasional datahoarder, called them out on it many years ago.
If a company uses the word unlimited to describe their service, but then attempts to weasel out of it via their T&Cs, that doesn't constitute a disagreement over the meaning of the word unlimited. It just means the company is lying.
From a philosophical standpoint, I agree, but it terms of service providers "unlimited" has always pretty much always been synonymous with "unmetered" (i.e. we don't charge you for traffic, but we will still throttle you if you are affecting service reliability for other customers)
I use them for the b2 bucket style storage where this happens. Its expensive per gig compared to the cost of a working personal unlimited desktop account. I like to visit their reddit page occasionally and its a constant stream of desktop client woes and stories of restoring problems and any time b2 is mentioned its like "but muh 50 terabytes" lol
It's cheaper if you have multiple computers with normal amounts of data though. My whole family is on my B2 account (Duplicati backing up eight computers each to a separate bucket), and it's $10/month.
I can understand in theory why they wouldn't want to back up .git folders as-is. Git has a serious object count bloat problem if you have any repository with a good amount of commit history, which causes a lot of unnecessary overhead in just scanning the folder for files alone.
I don't quite understand why it's still like this; it's probably the biggest reason why git tends to play poorly with a lot of filesystem tools (not just backups). If it'd been something like an SQLite database instead (just an example really), you wouldn't get so much unnecessary inode bloat.
At the same time Backblaze is a backup solution. The need to back up everything is sort of baked in there. They promise to be the third backup solution in a three layer strategy (backup directly connected, backup in home, backup external), and that third one is probably the single most important one of them all since it's the one you're going to be touching the least in an ideal scenario. They really can't be excluding any files whatsoever.
The cloud service exclusion is similarly bad, although much worse. Imagine getting hit by a cryptoworm. Your cloud storage tool is dutifully going to sync everything encrypted, junking up your entire storage across devices and because restoring old versions is both ass and near impossible at scale, you need an actual backup solution for that situation. Backblaze excluding files in those folders feels like a complete misunderstanding of what their purpose should be.
Why should a file backup solution adapt to work with git? Or any application? It should not try to understand what a git object is.
I’m paying to copy files from a folder to their servers just do that. No matter what the file is. Stay at the filesystem level not the application level.
I'm not saying Backblaze should adapt to git; the issue isn't application related (besides git being badly configured by default; there's a solution with git gc, it's just that git gc basically never runs).
It's that to back up a folder on a filesystem, you need to traverse that folder and check every file in that folder to see if it's changed. Most filesystem tools usually assume a fairly low file count for these operations.
Git, rather unusually, tends to produce a lot of files in regular use; before packing, every commit/object/branch is simply stored as a file on the filesystem (branches only as pointers). Packing fixes that by compressing commit and object files together, but it's not done by default (only after an initial clone or when the garbage collector runs). Iterating over a .git folder can take a lot of time in a place that's typically not very well optimized (since most "normal" people don't have thousands of tiny files in their folders that contain sprawled out application state.)
The correct solution here is either for git to change, or for Backblaze to implement better iteration logic (which will probably require special handling for git..., so it'd be more "correct" to fix up git, since Backblaze's tools aren't the only ones with this problem.)
7za (the compression app) does blazingly fast iteration over any kind of folder. This doesn't require special code for git. Backblaze's backup app could do the same but rather than fix their code they excluded .git folders.
When I backup my computer the .git folders are among the most important things on there. Most of my personal projects aren't pushed to github or anywhere else.
Fortunately I don't use Backblaze. I guess the moral is don't use a backup solution where the vendor has an incentive to exclude things.
I think it's understandable for both Backblaze and most users, but surely the solution is to add `.git` to their default exclusion list which the user can manage.
I think they shouldn't back up git objects individually because git handles the versioning information. Just compress the .git folder itself and back it up as a single unit.
Better yet, include dedpulication, incremental versioning, verification, and encryption. Wait, that's borg / restic.
This is a joke, but honestly anyone here shouldn't be directly backing up their filesystems and should instead be using the right tool for the job. You'll make the world a more efficient place, have more robust and quicker to recover backups, and save some money along the way.
Eh, you really shouldn't do that for any kind of file that acts like a (an impromptu) database. This is how you get corruption. Especially when change information can be split across more than one file.
It's probably primarily because Linus is a kernel and filesystem nerd, not a database nerd, so he preferred to just use the filesystem which he understood the performance characteristics of well (at least on linux).
> SourceGear Vault Pro is a version control and bug tracking solution for professional development teams. Vault Standard is for those who only want version control. Vault is based on a client / server architecture using technologies such as Microsoft SQL Server and IIS Web Services for increased performance, scalability, and security.
I decided to look into this (git gc should also be doing this), and I think I figured out why it's such a consistent issue with git in particular. Running git gc does properly pack objects together and reduce inode count to something much more manageable.
It's the same reason why the postgres autovacuum daemon tends to be borderline useless unless you retune it[0]: the defaults are barmy. git gc only runs if there's 6700 loose unpacked objects[1]. Most typical filesystem tools tend to start balking at traversing ~1000 files in a structure (depends a bit on the filesystem/OS as well, Windows tends to get slower a good bit earlier than Linux).
To fix it, running
> git config --global gc.auto 1000
should retune it and any subsequent commit to your repo's will trigger garbage collection properly when there's around 1000 loose files. Pack file management seems to be properly tuned by default; at more than 50 packs, gc will repack into a larger pack.
[0]: For anyone curious, the default postgres autovacuum setting runs only when 10% of the table consists of dead tuples (roughly: deleted+every revision of an updated row). If you're working with a beefy table, you're never hitting 10%. Either tune it down or create an external cronjob to run vacuum analyze more frequently on the tables you need to keep speedy. I'm pretty sure the defaults are tuned solely to ensure that Postgres' internal tables are fast, since those seem to only have active rows to a point where it'd warrant autovacuum.
A few thousand files shouldn't be a problem to a program designed to scan entire drives of files. Even in a single folder and considering sloppy programs I wouldn't worry just yet, and git's not putting them in a single folder.
I had similar experience as well. They upgraded their client and server software something like 5 years ago which put forward different restrictions on character set used for password. I have used a special character which was no longer allowed. When I needed to restore files after disk failure I could not log in either in the app or on the website. The customer service was useless -- we are sorry, your fault. I have lost 1 TB of personal photos due to this as a paying customer. Never trust Backblaze.
I have the same experience with Backblaze. 3 years ago I tried to restore my files from Backblaze, using their desktop client.
First thing I noticed is that if it can't download a file due to network or some other problem then it just skips it. But you can force it to retry by modifying its job file which is just an SQLite DB. Also it stores and downloads files by splitting them into small chunks. It stores checksums of these chunks, but it doesn't store the complete checksum of the file, so judging by how badly the client is written I can't be sure that restored files are not corrupted after the stitching.
Then I found out that it can't download some files even after dozens of retries because it seems they are corrupted on Backblaze side.
But the most jarring issue for me is that it mangled all non-ascii filenames. They are stored as UTF-8 in the DB, but the client saves them as Windows-1252 or something. So I ended up with hundreds of gigabytes of files with names like фикац, and I can't just re-encode these names back, because some characters were dropped during the process.
I wanted to write a script that forces Backblaze Client to redownload files, logs all files that can't be restored, fixes the broken names and splits restored files back into chunks to validate their checksums against the SQLite DB, but it was too big of a task for me, so I just procrastinated for 3 years, while keeping paying monthly Backblaze fees because it's sad to let go of my data.
Do you have any more details? This is a pretty big deal. The differentiators between Backblaze and Hetzner mostly boil down to this kind of thing supposedly not being possible.
I’m on my phone so forgive the formatting, but here’s my entire support exchange:
- - -
Hey, I tried restoring a file from my backup — downloading it directly didn't work, and creating a restore with it also failed – I got an email telling me contract y'all about it.
Can you explain to me what happened here, and what can I do to get my file(s?) back?
- - -
Hi Jan,
Thanks for writing in!
I've reached out to our engineers regarding your restore, and I will get back to you as soon as I have an update. For now, I will keep the ticket open.
- - -
Hi Jan,
Regarding the file itself - it was deleted back in 2022, but unfortunately, the deletion never got recorded properly, which made it seem like the file still existed.
Thus, when you tried to restore it, the restoration failed, as the file doesn't actually exist anymore. In this case, it shouldn't have been shown in the first place.
For that, I do apologize. As compensation, we've granted you 3 monthly backup credits which will apply on your next renewal. Please let me know if you have any further questions.
- - -
That makes me even more confused to be honest - I’ve been paying for forever history since January 2022 according to my invoices?
Do you know how/when exactly it got deleted?
- - -
Hi Jan,
Unfortunately, we don't have that information available to us. Again, I do apologize.
- - -
I really don’t want to be rude, but that seems like a very serious issue to me and I’m not satisfied with that response.
If I’m paying for a forever backup, I expect it to be forever - and if some file got deleted even despite me paying for the “keep my file history forever” option, “oh whoops sorry our bad but we don’t have any more info” is really not a satisfactory answer.
I don’t hold it against _you_ personally, but I really need to know more about what happened here - if this file got randomly disappeared, how am I supposed to trust the reliability of anything else that’s supposed to be safely backed up?
- - -
Hi Jan,
I'll inquire with our engineers tomorrow when they're back in, and I'll update you as soon as I can. For now, I will keep the ticket open.
- - -
Appreciate that, thank you! It’s fine if the investigation takes longer, but I just want to get to the bottom of what happened here :)
- - -
Hi Jan,
Thanks for your patience.
According to our engineers and my management team:
With the way our program logs information, we don't have the specific information that explains exactly why the file was removed from the backup. Our more recent versions of the client, however, have vastly improved our consistency checks and introduced additional protections and audits to ensure complete reliability from an active backup.
Looking at your account, I do see that your backup is currently not active, so I recommend running the Backblaze installer over your current installation to repair it, and inherit your original backup state so that our updates can check your backup.
I do apologize, and I know it's not an ideal answer, but unfortunately, that is the extent of what we can tell you about what has happened.
- - -
I gave up escalating at this point and just decided these aren’t trusted anymore.
The files in question are four year old at this point so it’s hard for me conclusively state, so I guess there might be a perfect storm of that specific file being deleted because it was due to expire before upgraded to “keep history forever”, but I don’t think it’s super likely, and I absolutely would expect them to have telemetry about that in any case.
If anyone from Backblaze stumbles upon it and wants to escalate/reinvestigate, the support ID is #1181161.
I noticed this (thankfully before it was critical) and I’ve decided to move on from BB. Easily over 10 year customer. Totally bogus. Not only did it stop backing it up the old history is totally gone as well.
The one thing they have to do is backup everything and when you see it in their console you can rest assured they are going to continue to back it up.
They’ve let the desktop client linger, it’s difficult to add meaningful exceptions. It’s obvious they want everyone to use B2 now.
Not OP, but I have been using borg backup [1] against Hetzner Storage Box [2]
Borg backup is a good tool in my opinion and has everything that I need (deduplication, compression, mountable snapshot.
Hetzner Storage Box is nothing fancy but good enough for a backup and is sensibly cheaper for the alternatives (I pay about 10 eur/month for 5TB of storage)
Before that I was using s3cmd [3] to backup on a S3 bucket.
I have used Arq for way over a decade. It does incremental encrypted backups and supports a lot of storage providers. Also supports S3 object lock (to protect against ransomware). It’s awesome!
(Arq developer here) By default Arq tries to be unobtrusive. Edit your backup plan and slide the “CPU usage” slider all the way to the right to make it go faster.
At some point, Backblaze just silently stopped backing up my encrypted (VeraCrypt) drives. Just stopped working without any announcement, warning or notification. After lots of troubleshooting and googling I found out that this was intentional from some random reddit thread. I stopped using their backup service after that.
Some companies are in the business of trust. These companies NEED to understand that trust is somewhat difficult to earn, but easy to lose and nearly IMPOSSIBLE to regain. After reading this article I will almost certainly never use or recommend Backblaze. (And while I don't use them currently, they WERE on the list of companies I would have recommended due to the length of their history.)
It looks like the following line has been added to /Library/Backblaze.bzpkg/bzdata/bzexcluderules_mandatory.xml which excludes my Dropbox folder from getting backed up:
That is the exact path to my Dropbox folder, and I presume if I move my Dropbox folder this xml file will be updated to point to the new location. The top of the xml file states "Mandatory Exclusions: editing this file DOES NOT DO ANYTHING".
.git files seem to still be backing up on my machine, although they are hidden by default in the web restore (you must open Filters and enable Show Hidden Files). I don't see an option to show hidden files/folders in the Backblaze Restore app.
After mucking around with various easy to use options my lack of trust[1] pushed me into a more-complicated-but-at-least-under-my-control-option: syncthing+restic+s3 compatible cloud provider.
Basically it works like this:
- I have syncthing moving files between all my devices. The larger the device, the more stuff I move there[2]. My phone only has my keepass file and a few other docs, my gaming PC has that plus all of my photos and music, etc.
- All of this ends up on a raspberry pi with a connected USB harddrive, which has everything on it. Why yes, that is very shoddy and short term! The pi is mirrored on my gaming PC though, which is awake once every day or two, so if it completely breaks I still have everything locally.
- Nightly a restic job runs, which backs up everything on the pi to an s3 compatible cloud[3], and cleans out old snapshots (30 days, 52 weeks, 60 months, then yearly)
- Yearly I test restoring a random backup, both on the pi, and on another device, to make sure there is no required knowledge stuck on there.
This is was somewhat of a pain to setup, but since the pi is never off it just ticks along, and I check it periodically to make sure nothing has broken.
[1] there is always weirdness with these tools. They don't sync how you think, or when you actually want to restore it takes forever, or they are stuck in perpetual sync cycles
[2] I sync multiple directories, broadly "very small", "small", "dumping ground", and "media", from smallest to largest.
[3] Currently Wasabi, but it really doens't matter. Restic encrypts client side, you just need to trust the provider enough that they don't completely collapse at the same time that you need backups.
I once had to restore around 2 TB of RAW photos.
The app was a mess. It crashed every few hours. I ended up manually downloading single folders over a timespan of 2 weeks to restore my data. The support only apologized and could not help with my restore problem. After this I cancelled my subscription immediately and use local drives for my backups now, drives which I rotate (in use and locations).
> My first troubling discovery was in 2025, when I made several errors then did a push -f to GitHub and blew away the git history for a half decade old repo. No data was lost, but the log of changes was.
I know this is besides the point somewhat, but: Learn your tools people. The commit history could probably have been easily restored without involving any backup. The commits are not just instantly gone.
Indeed, the commits and blobs might even have still been available on the GitHub remote, I'm not sure they clean them on some interval or something, but bunch of stuff you "delete" from git still stays in the remote regardless of what you push.
Weirdly, reading this had the net impact of me signing up to Backblaze.
I had no idea that it was such a good bargain. I used to be a Crashplan user back in the day, and I always thought Backblaze had tiered limits.
I've been using Duplicati to sync a lot of data to S3's cheapest tape-based long term storage tier. It's a serious pain in the ass because it takes hours to queue up and retrieve a file. It's a heavy enough process that I don't do anything nearly close to enough testing to make sure my backups are restorable, which is a self-inflicted future injury.
Here's the thing: I'm paying about $14/month for that S3 storage, which makes $99/year a total steal. I don't use Dropbox/Box/OneDrive/iCloud so the grievances mentioned by the author are not major hurdles for me. I do find the idea that it is silently ignoring .git folders troubling, primarily because they are indeed not listed in the exclusion list.
I am a bit miffed that we're actively prevented from backing up the various Program Files folders, because I have a large number of VSTi instruments that I'll need to ensure are rcloned or something for this to work.
I can almost almost understand the logic behind not backing up OneDrive/Dropbox. I think it's bad logic but I can understand where it's coming from.
Not backing up .git folders however is completely unacceptable.
I have hundreds of small projects where I use git track of history locally with no remote at all. The intention is never to push it anywhere. I don't like to say these sorts of things, and I don't say it lightly when I say someone should be fired over this decision.
I found out the hard way that backblaze just deletes backed up data from external hard drives that haven't been connected in a while. I had like 2TB total.
I just looked in my Backblaze restore program, and all my .git folders are in there. I did have to go to the Settings menu and toggle an option to show hidden files. This is the Mac version.
I think the target of the anger here should be (at least in part): OneDrive.
My understanding is that a modern, default onedrive setup will push all your onedrive folder contents to the cloud, but will not do the same in reverse -- it's totally possible to have files in your cloud onedrive, visible in your onedrive folder, but that do not exist locally. If you want to access such a file, it typically gets downloaded from onedrive for you to use.
If that's the case, what is Backblaze or another provider to do? Constantly download your onedrive files (that might have been modified on another device) and upload them to backblaze? Or just sync files that actually exist locally? That latter option certainly would not please a consumer, who would expect the files they can 'see' just get magically backed up.
It's a tricky situation and I'm not saying Backblaze handled it well here, but the whole transparent cloud storage situation thing is a bit of a mess for lots of people. If Dropbox works the same way (no guaranteed local file for something you can see), that's the same ugly situation.
Most have pointed out that the OneDrive exclusion makes sense due to its complexity. But I see no one here defending the undocumented .git exclusion. That’s pretty egregious - if I’m backing up that directory it’s always 100% intentional and it definitely feels like a sacrifice to the product functionality for stability and performance. Not documenting it just twists the knife.
Ironically drop box and one drive folders I can still somewhat understand as they are "backuped" in other ways (but potentially not reliable so I also understand why people do not like that).
But .git? It does not mean you have it synced to GitHub or anything reliable?
If you do anything then only backup the .git folder and not the checkout.
But backing up the checkout and not the .git folder is crazy.
normally this folder are synced to dropbox and/or onedrive
both services have internal backups to reduce the chance they lose data
both services allow some limited form of "going back to older version" (like the article states itself).
Just because the article says "sync is not backup" doesn't mean that is true, I mean it literally is backup by definition as it: makes a copy in another location and even has versioning.
It's just not _good enough_ backup for their standards. Maybe even standards of most people on HN, but out there many people are happy with way worse backups, especially wrt. versioning for a lot of (mostly static) media the only reason you need version rollback is in case of a corrupted version being backed up. And a lot of people mostly backup personal photos/videos and important documents, all static by nature.
Through
1. it doesn't really fulfill the 3-2-1 rules it's only 2-1-1 places (local, one backup on ms/drop box cloud, one offsite). Before when it was also backed up to backblaze it was 3-2-1 (kinda). So them silently stopping still is a huge issue.
2. newer versions of the 3-2-1 rule also say treat 2 not just as 2 backups, but also 2 "vendors/access accounts" with the one-drive folder pretty much being onedrive controlled this is 1 vendor across local and all backups. Which is risky.
> You are using it to mean "maintaining full version history", I believe?
No, they are using it to mean “backed up”. Like, “if this data gets deleted or is in any way lost locally, it’s still backed remotely (even years later, when finally needed)”.
I’m astonished so many people here don’t know what a backup is! No wonder it’s easy for Backblaze to play them for fools.
definition of the term backup by most sources is one the line of:
> a copy of information held on a computer that is stored separately from the computer
there is nothing about _any_ versioning, or duration requirements or similar
To use your own words, I fear its you who doesn't know what a backup is and assume a lot other additional (often preferable(1)) things are part of that term.
Which is a common problem, not just for the term backup.
There is a reason lawyers define technical terms in a for this contract specific precise way when making contracts.
Or just requirements engineering. Failing there and you might end up having a backup of all your companies important data in a way susceptible to encrypting your files ransomware or similar.
---
(1): What often is preferable is also sometimes the think you really don't want. Like sometimes keeping data around too long is outright illegal. Sometimes that also applies to older versions only. And sometimes just some short term backups are more then enough for you use case. The point here is the term backup can't mean what you are imply it does because a lot of existing use cases are incompatible with it.
Microsoft makes no guarantees on onedrive, you are responsible for backing up that data. Of course they try hard to keep it safe, but contractually they give no promises
Unrelated to the main point, and probably too late to matter, but you can access repo activity logs via Github's API. I had to clean up a bad push before and was able to find the old commit hash in the logs, then reset the branch to that commit, similarly to how you'd fix local messes using reflog.
Use restic with resticprofile and you won't need anything else. Point it to a Hetzner storagebox, the best value you can get. Don't trust fisher price backup plans
I think this is a risk with anything that promotes itself as "unlimited", or otherwise doesn't specify concrete limits. I'm always sceptical of services like this as it feels like the terms could arbitrarily change at any point, as we've found out here.
(as a side note, it's funny to see see them promoting their native C app instead of using Java as a "shortcut". What I wouldn't give for more Java apps nowadays)
I was always roughly of the mind that Backblaze was just too close to the "If it's too good to be true it probably is", seems like that may have been a good decision.
I’ve been using it for years, and the one time I needed to restore a file, I realized that VMware VMs files were excluded from the backup. They are so many exclusion that I start doing physical backup again.
I think this should not be attributed to malice, however unfortunate. I had also developed some sync app once and onedrive folders were indeed problematic, causing cyclic updates on access and random metadata changes for no explicit reason.
Complete lack of communication (outside of release notes, which nobody really reads, as the article too states) is incompetence and indeed worrying.
Just show a red status bar that says "these folders will not be backed up anymore", why not?
If the constant meta changes (or other peculiarities involving those folders) make the sync unusable, then it can be both. In that case, you stop syncing and communicate.
So my idea is that it's a competency problem (lack of communication), not malice. But it's just a theory, based on my own experience.
In any case, this is a bad situation, however you look at it.
On the topic of backing up data from cloud platforms such as Onedrive, I suspect this is stop the client machine from actively downloading 'files on demand' which are just pointers in explorer until you go to open them.
If you've got huge amounts of files in Onedrive and the backup client starts downloading everyone of them (before it can reupload them again) you're going to run into problems.
This is a pain, to be sure, but surely there is some sort of logic you could implement to detect whether a file is a Real File that actually exists on the device (if so, back it up) or a pointer to the cloud (ignore it by default, probably, but maybe provide a user setting to force it to back up even these)
It used to be the case that placeholder files were very obvious but now OneDrive and iCloud (possibly others) work more like an attached network storage with some local cache, and that was a good move for most programs because back then a file being evicted from storage looked like a file deletion.
Came here to say this. Files in OneDrive get removed from your local storage and are downloaded ON DEMAND. given that you can have 1TB+ onedrive folder, backblaze downloading all of that is gonna throttle your connection and fill up your disk real fast.
I assume when asking such a question, you expect an honest answer like mine:
rclone is my favorite alternative. Supports encryption seamlessly, and loaded with features. Plus I can control exactly what gets synced/backed up, when it happens, and I pay for what I use (no unsustainable "unlimited" storage that always comes with annoying restrictions). There's never any surprises (which I experienced with nearly every backup solution). I use Backblaze B2 as the backend. I pay like $50 a month (which I know sounds high), but I have many terabytes of data up there that matters to me (it's a decade or more of my life and work, including long videos of holidays like Christmas with my kids throughout the years).
For super-important stuff I keep a tertiary backup on Glacier. I also have a full copy on an external harddrive, though those drives are not very reliable so I don't consider it part of the backup strategy, more a convenience for restoring large files quickly.
I feel that's a systemic problem with all consumer online-backup software: They often use the barest excuse to not back things up. At best, it's to show a fast progress bar to the average user, and at worst it's to quietly renege on the "unlimited" capacity they promised when they took your money. [1]
Trying to audit—let alone change—the finer details is a pain even for power users, and there's a non-zero risk the GUI is simply lying to everybody while undocumented rules override what you specified.
When I finally switched my default boot to Linux, I found many of those offerings didn't support it, so I wrote some systemd services around Restic + Backblaze B2. It's been a real breath of fresh air: I can tell what's going on, I can set my own snapshot retention rules, and it's an order of magnitude cheaper. [2]
____
[1] Along the lines of "We have your My Documents. Oh, you didn't manually add My Videos or My Music for every user? Too bad." Or in some cases, certain big-file extensions are on the ignore list by default for no discernible reason.
[2] Currently a dollar or two a month for ~200gb. It doesn't change very much, and data verification jobs redownload the total amount once a month. I don't backn up anything I could get from elsewhere, like Steam games. Family videos are in the care of different relatives, but I'm looking into changing that.
With restic I don't need some kind of special server daemon on the other end, I can point my backup destination to any mountable filesystem, or relatively dumb "bucket" stores like S3 or B2. I like having the sense of options and avoiding lock-in. [1]
As for GUIs in general... Well, like I said, I just finished several years of bad experiences with some proprietary ones, and I wanted to see and choose what was really going on.
At this point, I don't think I'd ever want a GUI beyond a basic status-reporting widget. It's not like I need to regularly micromanage the folder-set, especially when nobody else is going to tweak it by surprise.
_____
[1] The downside to the dumb-store is a ransomware scenario, where the malware is smart enough to go delete my old snapshots using the same connection/credentials. Enforcing retention policies on the server side necessarily needs a smarter server. B2 might actually have something useful there, but I haven't dug into it.
Yes, you're exactly right. Once they decide not to exclude certain filetypes it puts the burden on the endusers who are unequipped to monitor these changes.
This is really disturbing to hear as I've incorporated B2 into a lot of my flow for backups as well as a storage backend for Nextcloud and planned as the object store for some upcoming archival storage products I'm working on.
I know the post is talking about their personal backup product but it's the same company and so if they sneak in a reduction of service like this, as others have already commented, it erodes difficult-to-earn trust.
I like backblaze for backups, but I use restic and b2. You get what you pay for. Really lame behavior from backblaze as I always recommended their native backup solution to others and now need to reconsider.
That's pretty crazy because I just set up personal backups with a different service (rsync.net, I was already using it for WP website backups) and my git folders were literally my first priority
Hetzner storagebox. 1TB for under 5 bucks/month, 5TB for under 15. Sftp access. Point your restic there. Backup game done, no surprises, no MBAs involved.
Selfhosting Offsite is hard.
Accessing services via standard protocols like ssh/webdav and just pushing your encrypted blobs there is a good middle ground.
They can't control what you upload, and you can easily point your end-point somewhere else if you need to move.
Same, I use Restic + Backrest (plus monitoring on Healthchecks, self-hosted + Prometheus/AlertManager/Pushover), with some decent structure - local backups every half-an-hour to raid1, every hour a backup to my old NAS, every day a backup to FTP in Helsinki, and once a week some backups to Backblaze (via Restic). Gives me local backups, observability, remote backups spread across different providers - seems quite safe :) I highly recommend to everyone figuring out a good backup strategy, takes a day or two.
Edit: on top of that I've built a custom one-page monitoring dashboard, so I see everything in one place (https://imgur.com/B3hppIW) - I'll opensource, it's decent architecture, I just need to cleanup some secrets from Git history...
What cloud backend are people using for restic? B2/S3/something else? I'm still just backing up to other machines using it (though I'd also heavily recommend restic)
I run restic with rclone, which is not only compatible with S3-like storage (which include many, like Hetzner, OVH, Exoscale) but many others, from mega to pcloud through Google Drive.
For stuff I care about (mostly photos), I back them up on two different services. I don't have TBs of those, so it's not very expensive. My personal code I store on git repositories anyway (like SourceHut or Codeberg or sometimes GitHub).
If this is true, I'll need to stop using Backblaze. I have been relying on them for years. If I had discovered this mid-restore, I think I would have lost my mind.
The article links to a statement made by Backblaze:
"The Backup Client now excludes popular cloud storage providers [...] this change aligns with Backblaze’s policy to back up only local and directly connected storage."
I guess windows 10 and 11 users aren't backing up much to Backblaze, since microsoft is tricking so many into moving all of their data to onedrive.
Not backing up cloud is a good default. I have had people complain about performance when they connected to our multiple TB shared drive because their backup software fetched everything. There are of course reasons to back that up I am not belittling that, but not for people who want temporary access to some 100GB files i.e. most people in my situation.
I only use Backblaze as a cold storage service so this doesn't affect me but it's worth knowing about changes in the delivery of their other services as it might become widespread
This is terrifying. Aren't Backblaze users paying per-GB of storage/transfer? Why should it matter what's being stored, as long as the user is paying the costs? This will absolutely result in permanent data loss for some subset of their users.
I hope Backblaze responds to this with a "we're sorry and we've fixed this."
* S3 is super expensive, unless you use Glacier, but that has a high overhead per file, so you should bundle them before uploading.
* If your value your privacy, you need to encrypt the files on the client before uploading.
* You need to keep multiple revisions of each file, and manage their lifecycle. Unless you're fine with losing any data that was overwritten at the time of the most recent backup.
* You need to de-duplicate files, unless you want bloat whenever you rename a file or folder.
* Plus you need to pay for Amazon's extortionate egress prices if you actually need to restore your data.
I certainly wouldn't want to handle all that on my own in a script. What can make sense is using open source backup software with S3/R2/B2 as backing storage.
I like how you can set multiple keys (much like LUKS) so that the key used by scheduled backups can be changed without messing with the key that I have memorized to restore with when disaster strikes.
It also means you can have multiple computers backing up (sequentially, not simultaneously) to the same repository, each with their own key.
I recently stopped using Backblaze after a decade because it was using over 20GB of RAM on my machine. I also realized that I mostly wanted it for backing up old archival data that doesn’t change ever really. So I created a B2 bucket and uploaded a .tar.xz file.
ANY company, and I do mean any, that offers "unlimited" anything is 100% a scam. At best its a temporary growth hack to entice people who havent had technology rug-pulls. And when profits dwindle and the S curve is near the upper coast, you can guarantee that "unlimited" will get hidden restrictions, exclusions, "terms of service" changes, nebulous fair use policies that arent fair, and more dark patterns. And every one of them are "how do we worsen unlimited to make more money on captive customers?"
We're also seeing this play out in real time with Anthropic with their poop-splatter-llm. They've gone through like 4 rug-pulls, and people STILL pay $200/month for it. Every round, their unlimited gets worse and worse, like I outlined above.
Pay as you go is probably the more fair. But SaaS providers reallllllly hate in providing direct and easy to use tools to identify costs, or <gasp> limit the costs. A storag/backup provider could easily show this. LLM providers could show a near-realtime token utilization.
But no. Dark patterns, rug-pulls, and "i am altering the deal, pray i do not alter it further".
My naive idea:
Download 100 TB every 3 month to a 2nd device, create a list of files restored, validate checksums with the original machine, make a list of files differing and missing, check which ones are supposed to be missing? That sounds like a full time job.
Holy Hannah, this is such bullshit from Backblaze. Both the .git directory (why would I not SPECIFICALLY want this backed up for my projects?) and the cloud directories.
I get that changing economics make it more difficult to honor the original "Backup Everything" promise but this feels very underhanded. I'll be cancelling.
Blackblaze's personal backup solution is a mess in general. The client is clearly a giant pile of spaghetti code and I've had numerous issues with it, trying to figure out and change which files it does and doesn't backup is just one of them.
The configuration and logging formats they use are absolutely nonsensical.
I've recently been looking for online backup providers and Backblaze came highly recommended to me - but I think after reading this article I'll look elsewhere because this kind of behavior seems like the first step on the path of enshittification.
Surprisingly only the headings (2.05) and links (3.72) fail the Firefox accessibility check, the body text is 5.74. But subjectively it seems worse and I definitely agree with you that the contrast is too low.
Contrast looks good for the text, but the font used has very thin lines. A thicker font would have been readable by itself. At 250% page zoom it's good enough, if you don't enable the browser built-in reader mode.
I wonder if it's because of the font-weight being decreased. If I disable the `font-weight` rule in Firefox's Inspector the text gets noticeably darker, but the contrast score doesn't change. Could be a bad interaction with anti-aliasing thin text that the contrast checker isn't able to pick up.
I'd say it looks pretty readable on android although I still wouldn't describe it as good. I wouldn't say I feel encouraged to squint. But possibly different antialiasing explains it.
Like this, by word of mouth. That’s how Apple has done UI design since they stopped printing paper manuals.
- ctrl-shift-. to show hidden files on macOS
- pull down to see search box (iOS 18)
- swipe from top right corner for flashlight button
- swipe up from lower middle for home screen
Not restricted to Apple, but TIL: Double-clicking on a word an keeping the second click pressed, then dragging, allows you to select per word instead of per character.
> I have a gesture for whoever decided "find in page" should go under share.
You can also just type your search term into the normal address bar and there's an item at the bottom of the list for "on this page - find <search>". I'd never even seen the find-in-page button under share.
I found this to be a common theme in web design a while back, and in part led to an experiment developing a newspaper/Pocket-like interface to reading HN. It's not perfect, but is easier on the eyes for reading... https://times.hntrends.net/story/47762864
I'm also pretty sure 14 points font is a bit outdated at this point, 16 should probably be a minimum with current screens. It's not as if screens aren't wide enough to fit bigger text.
Haha I keep forgetting that. Fortunately the browser remembers my zoom settings per page. I'm pretty sure the font is now at 16 or something via repeated Cmd +.
10 point at 96 dpi or with correctly applied scaling is very readable. But some toolkits like GTK have huge paddings for their widgets, so the text will be readable, but you’ll lose density.
macOS/iOS Safari and Brave browsers have "Reader mode" . Chrome has a "Reading mode" but it's more cumbersome to use because it's buried in a side menu.
For desktop browsers, I also have a bookmarklet on the bookmarks bar with the following Javascript:
It doesn't darken the text on every webpage but it does work on this thread's article. (The Javascript code can probably be enhanced with more HTML heuristics to work on more webpages.)
One day try throwing a pair on you'll be surprised. The small thin font is causing this not the text contrast. This and low light scenarios are the first things to go.
Your iPhone has this cool feature called reader mode if you didn’t know.
As for mentioning WCAG - so what if it doesn’t adhere to those guidelines? It’s his personal website, he can do what he wants with it. Telling him you found it difficult to read properly is one thing but referencing WCAG as if this guy is bound somehow to modify his own aesthetic preference for generic accessibility reasons is laughable. Part of what continues to make the web good is differing personal tastes and unique website designs - it is stifling and monotonous to see the same looking shit on every site and it isn’t like there aren’t tools (like reader mode) for people who dislike another’s personal taste.
You can easily read it. If reading the article got you attention + imaginary HN points and complaining didn’t, I’m willing to bet you’d find a way to do the former without doing the latter.
Managing backup exclusions strikes again. It's impossible. Either commit to backing up the full disk, including the 80% of easily regenerated/redownloaded etc. data, or risk the 0.001% critical 16 byte file that turns out to contain your Bitcoin wallet key or god knows what else. I've been bitten by this more times than I'd like to admit managing my own backups, it's hard to expect a shrink-wrapped provider to do much better. It only takes one dumb simplification like "my Downloads folder is junk, no need to back that up" combined with (no doubt, years later) downloading say a 1Password recovery PDF that you lazily decide will live in that folder, and the stage is set.
Pinning this squarely on user error. Backblaze could clearly have done better, but it's such a well known failure mode that it's not much far off refusing to test restores of a bunch of tapes left in the sun for a decade.
It isn't user error if it was working perfectly fine until the provider made a silent change.
Unless the user error you are referring to is not managing their own backups, like I do. Though this isn't free from trouble, I once had silent failures backing up a small section of my stuff for a while because of an ownership/perms snafu and my script not sending the reports to stderr to anywhere I'd generally see them. Luckily an automated test (every now & then it scans for differences in the whole backup and current data) because it could see the source and noticed a copy wasn't in the latest snapshot on the far-away copy. Reliable backups is a harder problem then most imagine.
If there is a footgun I haven't considered yet in backup exclusions, I'd like to know more. Shouldn't it be safe to exclude $XDG_CACHE_HOME? Unfortunately, since many applications don't bother with the XDG standard, I have to exclude a few more directories, so if you have any stories about unexpected exclusions, would you mind sharing?
I don't remember why I started doing it, but I don't bulk exclude .cache for some reason or other. I have a script that strips down larger known caches as part of the backup. But the logic, whatever it was, is easy to understand: you're relying on apps to correctly categorise what is vs. isn't cache.
Also consider e.g. ~/.cache/thumbnails. It's easy to understand as a cache, but if the thumbnails were of photos on an SD card that gets lost or immediately dies, is it still a cache? It might be the only copy of some once-in-a-lifetime event or holiday where the card didn't make it back with you. Something like this actually happened to me, but in that case, the "cache" was a tarball of an old photo gallery generated from the originals that ought to have been deleted.
It's just really hard to know upfront whether something is actually important or not. Same for the Downloads folder. Vendor goes bankrupt, removes old software versions, etc. The only safe thing you can really do is hold your nose and save the whole lot.
Hell, if I open a directory of photos and my OS tries to pull exif data for each one, it would be wild if that caused those files to be fully downloaded and consume disk space.
- report an error
- ignore
- materialize
Regardless, if you make it back up software that doesn’t give this level of control to users, and you make a change about which files you’re going to back up, you should probably be a lot more vocal with your users about the change. Vanishingly few people read release notes.
And, as a separate note, they shouldn't be balking at the amount of data in a virtualized onedrive or dropbox either considering the user could get a many-terabyte hard drive for significantly less money.
The moment you call read() (or fopen() or your favorite function), the download will be triggered. It's a hook sitting between you and the file. You can't ignore it.
The only way to bypass it is to remount it over rclone or something and use "ls" and "lsd" functions to query filenames. Otherwise it'll download, and it's how it's expected to work.
It shouldn't stress things to spend a couple weeks relaying a terabyte in small chunks. The most likely strain is on my upload bandwidth and yeah that's the cost of cloud backup, more ISPs need to improve upload.
> more ISPs need to improve upload.
I was yelling the same things to the void for the longest time, then I had a brilliant idea of reading the technical specs of the technology coming to my home.
Lo and behold, the numbers I got were the technical limits of the technology that I had at home (PON for the time being), and going higher would need a very large and expensive rewiring with new hardware and technology.
> the technical limits of the technology that I had at home (PON for the time being)
Isn't that usually symmetrical? Is yours not?
Depends on your device capacity and how much is in actual use. Wear leveling things also wear things while it moves things around.
> For something you'll need to do once or twice ever?
I don't know you, but my cloud storage is living, and even if it's not living, if the software can't smartly ignore files, it'll pull everything in, compare and pass without uploading, causing churns in every backup cycle.
> Isn't that usually symmetrical? Is yours not?
GPON (Gigabit PON) is asymmetric. Theoretical limits is 2.4Gbps down, 1.2Gbps up. I have 1000Mbit/75Mbit at home.
It would be reasonable to say that if you run the file sync in a mode that keeps everything locally, then Backblaze should be backing it up. Arguably they should even when not in that mode, but it'll churn files repeatedly as you stream files in and out of local storage with the cloud provider.
When you have a couple terabytes of data in that drive, is it acceptable to cycle all that data and use all that bandwidth and wear down your SSD at the same time?
Also, high number of small files is a problem for these services. I have a large font collection in my cloud account and oh boy, if I want to sync that thing, the whole thing proverbially overheats from all the queries it's sending.
I assume you don’t think that, so I’m curious, what would you propose positively?
Yes, I didn't technically said that.
> It sounds like you are arguing it is impossible to backup files in Dropbox in any reasonable way, and therefore nobody should backup their cloud files.
I don't argue neither, either.
What I said is with "on demand file download", traditional backup software faces a hard problem. However, there are better ways to do that, primary candidate being rclone.
You can register a new application ID for your rclone installation for your Google Drive and Dropbox accounts, and use rclone as a very efficient, rsync-like tool to backup your cloud storage. That's what I do.
I'm currently backing up my cloud storages to a local TrueNAS installation. rclone automatically hash-checks everything and downloads the changed ones. If you can mount Backblaze via FUSE or something similar, you can use rclone as an intelligent MITM agent to smartly pull from cloud and push to Backblaze.
Also, using RESTIC or Borg as a backup container is a good idea since they can deduplicate and/or only store the differences between the snapshots, saving tons of space in the process, plus encrypting things for good measure.
So, in practice, you shouldn't have to download the whole remote drive when you do an incremental backup.
Interestingly, rclone supports that on many providers, but to be able to backblaze support that, it needs to integrate rclone, connect to the providers via that channel and request checks, which is messy, complicated, and computationally expensive. Even if we consider that you won't be hitting API rate limits on the cloud provider.
This is another example in disguise of two people disagreeing about what "unlimited" means in the context of backup, even if they do claim to have "no restrictions on file type or size" [2].
[1] https://www.reddit.com/r/backblaze/comments/jsrqoz/personal_... [2] https://www.backblaze.com/cloud-backup/personal
Always prefer businesses who are upfront and honest about what they can offer their users, in a sustainable way.
Or that they're targeting the mass retail market, where people are technically ignorant, and "unlimited" is required to compete.
And statistically-speaking, is viable as long as a company keeps its users to a normal distribution.
Once growth slows, churn eats much of the organic growth and you need to spend money on marketing.
Doing a bait-and-switch on a percentage of your paying customers, no matter how small the percentage is, may be "viable" for the company, but it's a hostile experience for those users, and companies deserve to be called out for it.
So… Marketing has taken over, just as parent comment said. Got it.
I do wish it was a word that had to be completely dropped from marketing/adverting.
For example there is not unlimited storage, hell the visible universe has a storage limit. There is not unlimited upload and download speed, and what if when you start using more space they started exponentially slowing the speed you could access the storage? Unlimited CPU time in processing your request? Unlimited execution slots to process your request? Unlimited queue size when processing your requests.
Hence everything turns into the mess of assumptions.
Yes, indeed, most relevant in this case probably "time" and "bandwidth", put together, even if you saturate the line for a month, they won't throttle you, so for all intents and purposes, the "data cap" is unlimited (or more precise; there is no data cap).
Nobody has turned the moon into a hard drive yet.
I doubt they have those pipes, at least if every of their customers (or a sufficiently large amount) would actually make use of that.
Second question would be, how long they would allow you to utilize your broadband 24/7 at max capacity without canceling your subscription. Which leads back to the point the person I replied to was making: If you truly make use of what is promised, they cancel you. Hence it is not a faithful offer in the first place.
Not important here because backblaze only has to match the storage of your single device. Plus some extra versions but one year multiplied by upload speed is also a tractable amount.
Residential network access is oversold as everything else.
The only difference with storage is there’s a theoretical maximum on how much a single person can use.
But you could just as well limit backup upload speed for similar effect. Having something about fair use in ToS is really not that different.
Of course, in countries where the internet isn't so developed as in other parts of the world, this might make sense, but modern countries don't tend to do that, at least in my experience.
My parents have gotten hit by this. Dad was downloading huge video files at one point on his WiFi and his ISP silently throttled him.
A common term is "data cap": https://en.wikipedia.org/wiki/Data_cap
Wow, I knew that was generally true, didn't know it was true for internet access in the US too, how backwards...
> A common term is "data cap": https://en.wikipedia.org/wiki/Data_cap
I think most are familiar with throttling because most (all?) phone plans have some data cap at one point, but I don't think I've heard of any broadband connections here with data caps, that wouldn't make any sense.
If a company uses the word unlimited to describe their service, but then attempts to weasel out of it via their T&Cs, that doesn't constitute a disagreement over the meaning of the word unlimited. It just means the company is lying.
I don't quite understand why it's still like this; it's probably the biggest reason why git tends to play poorly with a lot of filesystem tools (not just backups). If it'd been something like an SQLite database instead (just an example really), you wouldn't get so much unnecessary inode bloat.
At the same time Backblaze is a backup solution. The need to back up everything is sort of baked in there. They promise to be the third backup solution in a three layer strategy (backup directly connected, backup in home, backup external), and that third one is probably the single most important one of them all since it's the one you're going to be touching the least in an ideal scenario. They really can't be excluding any files whatsoever.
The cloud service exclusion is similarly bad, although much worse. Imagine getting hit by a cryptoworm. Your cloud storage tool is dutifully going to sync everything encrypted, junking up your entire storage across devices and because restoring old versions is both ass and near impossible at scale, you need an actual backup solution for that situation. Backblaze excluding files in those folders feels like a complete misunderstanding of what their purpose should be.
Why should a file backup solution adapt to work with git? Or any application? It should not try to understand what a git object is.
I’m paying to copy files from a folder to their servers just do that. No matter what the file is. Stay at the filesystem level not the application level.
It's that to back up a folder on a filesystem, you need to traverse that folder and check every file in that folder to see if it's changed. Most filesystem tools usually assume a fairly low file count for these operations.
Git, rather unusually, tends to produce a lot of files in regular use; before packing, every commit/object/branch is simply stored as a file on the filesystem (branches only as pointers). Packing fixes that by compressing commit and object files together, but it's not done by default (only after an initial clone or when the garbage collector runs). Iterating over a .git folder can take a lot of time in a place that's typically not very well optimized (since most "normal" people don't have thousands of tiny files in their folders that contain sprawled out application state.)
The correct solution here is either for git to change, or for Backblaze to implement better iteration logic (which will probably require special handling for git..., so it'd be more "correct" to fix up git, since Backblaze's tools aren't the only ones with this problem.)
When I backup my computer the .git folders are among the most important things on there. Most of my personal projects aren't pushed to github or anywhere else.
Fortunately I don't use Backblaze. I guess the moral is don't use a backup solution where the vendor has an incentive to exclude things.
This is a joke, but honestly anyone here shouldn't be directly backing up their filesystems and should instead be using the right tool for the job. You'll make the world a more efficient place, have more robust and quicker to recover backups, and save some money along the way.
See Fossil (https://fossil-scm.org/)
P.S. There's also (https://www.sourcegear.com/vault/)
> SourceGear Vault Pro is a version control and bug tracking solution for professional development teams. Vault Standard is for those who only want version control. Vault is based on a client / server architecture using technologies such as Microsoft SQL Server and IIS Web Services for increased performance, scalability, and security.
It's the same reason why the postgres autovacuum daemon tends to be borderline useless unless you retune it[0]: the defaults are barmy. git gc only runs if there's 6700 loose unpacked objects[1]. Most typical filesystem tools tend to start balking at traversing ~1000 files in a structure (depends a bit on the filesystem/OS as well, Windows tends to get slower a good bit earlier than Linux).
To fix it, running
> git config --global gc.auto 1000
should retune it and any subsequent commit to your repo's will trigger garbage collection properly when there's around 1000 loose files. Pack file management seems to be properly tuned by default; at more than 50 packs, gc will repack into a larger pack.
[0]: For anyone curious, the default postgres autovacuum setting runs only when 10% of the table consists of dead tuples (roughly: deleted+every revision of an updated row). If you're working with a beefy table, you're never hitting 10%. Either tune it down or create an external cronjob to run vacuum analyze more frequently on the tables you need to keep speedy. I'm pretty sure the defaults are tuned solely to ensure that Postgres' internal tables are fast, since those seem to only have active rows to a point where it'd warrant autovacuum.
[1]: https://git-scm.com/docs/git-gc
> git config --global gc.auto 1000
with the long option name, and no `=`.
I contacted the support asking WTF, "oh the file got deleted at some point, sorry for that", and they offered me 3 months of credits.
I do not trust my Backblaze backups anymore.
First thing I noticed is that if it can't download a file due to network or some other problem then it just skips it. But you can force it to retry by modifying its job file which is just an SQLite DB. Also it stores and downloads files by splitting them into small chunks. It stores checksums of these chunks, but it doesn't store the complete checksum of the file, so judging by how badly the client is written I can't be sure that restored files are not corrupted after the stitching.
Then I found out that it can't download some files even after dozens of retries because it seems they are corrupted on Backblaze side.
But the most jarring issue for me is that it mangled all non-ascii filenames. They are stored as UTF-8 in the DB, but the client saves them as Windows-1252 or something. So I ended up with hundreds of gigabytes of files with names like фикац, and I can't just re-encode these names back, because some characters were dropped during the process.
I wanted to write a script that forces Backblaze Client to redownload files, logs all files that can't be restored, fixes the broken names and splits restored files back into chunks to validate their checksums against the SQLite DB, but it was too big of a task for me, so I just procrastinated for 3 years, while keeping paying monthly Backblaze fees because it's sad to let go of my data.
I wonder if they fixed their client since then.
- - -
Hey, I tried restoring a file from my backup — downloading it directly didn't work, and creating a restore with it also failed – I got an email telling me contract y'all about it.
Can you explain to me what happened here, and what can I do to get my file(s?) back?
- - -
Hi Jan,
Thanks for writing in!
I've reached out to our engineers regarding your restore, and I will get back to you as soon as I have an update. For now, I will keep the ticket open.
- - -
Hi Jan,
Regarding the file itself - it was deleted back in 2022, but unfortunately, the deletion never got recorded properly, which made it seem like the file still existed.
Thus, when you tried to restore it, the restoration failed, as the file doesn't actually exist anymore. In this case, it shouldn't have been shown in the first place.
For that, I do apologize. As compensation, we've granted you 3 monthly backup credits which will apply on your next renewal. Please let me know if you have any further questions.
- - -
That makes me even more confused to be honest - I’ve been paying for forever history since January 2022 according to my invoices?
Do you know how/when exactly it got deleted?
- - -
Hi Jan,
Unfortunately, we don't have that information available to us. Again, I do apologize.
- - -
I really don’t want to be rude, but that seems like a very serious issue to me and I’m not satisfied with that response.
If I’m paying for a forever backup, I expect it to be forever - and if some file got deleted even despite me paying for the “keep my file history forever” option, “oh whoops sorry our bad but we don’t have any more info” is really not a satisfactory answer.
I don’t hold it against _you_ personally, but I really need to know more about what happened here - if this file got randomly disappeared, how am I supposed to trust the reliability of anything else that’s supposed to be safely backed up?
- - -
Hi Jan,
I'll inquire with our engineers tomorrow when they're back in, and I'll update you as soon as I can. For now, I will keep the ticket open.
- - -
Appreciate that, thank you! It’s fine if the investigation takes longer, but I just want to get to the bottom of what happened here :)
- - -
Hi Jan,
Thanks for your patience.
According to our engineers and my management team:
With the way our program logs information, we don't have the specific information that explains exactly why the file was removed from the backup. Our more recent versions of the client, however, have vastly improved our consistency checks and introduced additional protections and audits to ensure complete reliability from an active backup.
Looking at your account, I do see that your backup is currently not active, so I recommend running the Backblaze installer over your current installation to repair it, and inherit your original backup state so that our updates can check your backup.
I do apologize, and I know it's not an ideal answer, but unfortunately, that is the extent of what we can tell you about what has happened.
- - -
I gave up escalating at this point and just decided these aren’t trusted anymore.
The files in question are four year old at this point so it’s hard for me conclusively state, so I guess there might be a perfect storm of that specific file being deleted because it was due to expire before upgraded to “keep history forever”, but I don’t think it’s super likely, and I absolutely would expect them to have telemetry about that in any case.
If anyone from Backblaze stumbles upon it and wants to escalate/reinvestigate, the support ID is #1181161.
The one thing they have to do is backup everything and when you see it in their console you can rest assured they are going to continue to back it up.
They’ve let the desktop client linger, it’s difficult to add meaningful exceptions. It’s obvious they want everyone to use B2 now.
Borg backup is a good tool in my opinion and has everything that I need (deduplication, compression, mountable snapshot.
Hetzner Storage Box is nothing fancy but good enough for a backup and is sensibly cheaper for the alternatives (I pay about 10 eur/month for 5TB of storage)
Before that I was using s3cmd [3] to backup on a S3 bucket.
[1] https://www.borgbackup.org/
[2] https://s3tools.org/s3cmd
[3] https://s3tools.org/s3cmd
</bzexclusions><excludefname_rule plat="mac" osVers="*" ruleIsOptional="f" skipFirstCharThenStartsWith="*" contains_1="/users/username/dropbox/" contains_2="*" doesNotContain="*" endsWith="*" hasFileExtension="*" />
That is the exact path to my Dropbox folder, and I presume if I move my Dropbox folder this xml file will be updated to point to the new location. The top of the xml file states "Mandatory Exclusions: editing this file DOES NOT DO ANYTHING".
.git files seem to still be backing up on my machine, although they are hidden by default in the web restore (you must open Filters and enable Show Hidden Files). I don't see an option to show hidden files/folders in the Backblaze Restore app.
That would be nice, they'd be able to get their history back!
Basically it works like this:
- I have syncthing moving files between all my devices. The larger the device, the more stuff I move there[2]. My phone only has my keepass file and a few other docs, my gaming PC has that plus all of my photos and music, etc.
- All of this ends up on a raspberry pi with a connected USB harddrive, which has everything on it. Why yes, that is very shoddy and short term! The pi is mirrored on my gaming PC though, which is awake once every day or two, so if it completely breaks I still have everything locally.
- Nightly a restic job runs, which backs up everything on the pi to an s3 compatible cloud[3], and cleans out old snapshots (30 days, 52 weeks, 60 months, then yearly)
- Yearly I test restoring a random backup, both on the pi, and on another device, to make sure there is no required knowledge stuck on there.
This is was somewhat of a pain to setup, but since the pi is never off it just ticks along, and I check it periodically to make sure nothing has broken.
[1] there is always weirdness with these tools. They don't sync how you think, or when you actually want to restore it takes forever, or they are stuck in perpetual sync cycles
[2] I sync multiple directories, broadly "very small", "small", "dumping ground", and "media", from smallest to largest.
[3] Currently Wasabi, but it really doens't matter. Restic encrypts client side, you just need to trust the provider enough that they don't completely collapse at the same time that you need backups.
I never trust them again with my data.
I know this is besides the point somewhat, but: Learn your tools people. The commit history could probably have been easily restored without involving any backup. The commits are not just instantly gone.
Indeed, the commits and blobs might even have still been available on the GitHub remote, I'm not sure they clean them on some interval or something, but bunch of stuff you "delete" from git still stays in the remote regardless of what you push.
I had no idea that it was such a good bargain. I used to be a Crashplan user back in the day, and I always thought Backblaze had tiered limits.
I've been using Duplicati to sync a lot of data to S3's cheapest tape-based long term storage tier. It's a serious pain in the ass because it takes hours to queue up and retrieve a file. It's a heavy enough process that I don't do anything nearly close to enough testing to make sure my backups are restorable, which is a self-inflicted future injury.
Here's the thing: I'm paying about $14/month for that S3 storage, which makes $99/year a total steal. I don't use Dropbox/Box/OneDrive/iCloud so the grievances mentioned by the author are not major hurdles for me. I do find the idea that it is silently ignoring .git folders troubling, primarily because they are indeed not listed in the exclusion list.
I am a bit miffed that we're actively prevented from backing up the various Program Files folders, because I have a large number of VSTi instruments that I'll need to ensure are rcloned or something for this to work.
Not backing up .git folders however is completely unacceptable.
I have hundreds of small projects where I use git track of history locally with no remote at all. The intention is never to push it anywhere. I don't like to say these sorts of things, and I don't say it lightly when I say someone should be fired over this decision.
My understanding is that a modern, default onedrive setup will push all your onedrive folder contents to the cloud, but will not do the same in reverse -- it's totally possible to have files in your cloud onedrive, visible in your onedrive folder, but that do not exist locally. If you want to access such a file, it typically gets downloaded from onedrive for you to use.
If that's the case, what is Backblaze or another provider to do? Constantly download your onedrive files (that might have been modified on another device) and upload them to backblaze? Or just sync files that actually exist locally? That latter option certainly would not please a consumer, who would expect the files they can 'see' just get magically backed up.
It's a tricky situation and I'm not saying Backblaze handled it well here, but the whole transparent cloud storage situation thing is a bit of a mess for lots of people. If Dropbox works the same way (no guaranteed local file for something you can see), that's the same ugly situation.
But .git? It does not mean you have it synced to GitHub or anything reliable?
If you do anything then only backup the .git folder and not the checkout.
But backing up the checkout and not the .git folder is crazy.
No they are not. This is explicitly addressed in the article itself.
both services have internal backups to reduce the chance they lose data
both services allow some limited form of "going back to older version" (like the article states itself).
Just because the article says "sync is not backup" doesn't mean that is true, I mean it literally is backup by definition as it: makes a copy in another location and even has versioning.
It's just not _good enough_ backup for their standards. Maybe even standards of most people on HN, but out there many people are happy with way worse backups, especially wrt. versioning for a lot of (mostly static) media the only reason you need version rollback is in case of a corrupted version being backed up. And a lot of people mostly backup personal photos/videos and important documents, all static by nature.
Through
1. it doesn't really fulfill the 3-2-1 rules it's only 2-1-1 places (local, one backup on ms/drop box cloud, one offsite). Before when it was also backed up to backblaze it was 3-2-1 (kinda). So them silently stopping still is a huge issue.
2. newer versions of the 3-2-1 rule also say treat 2 not just as 2 backups, but also 2 "vendors/access accounts" with the one-drive folder pretty much being onedrive controlled this is 1 vendor across local and all backups. Which is risky.
You are using it to mean "maintaining full version history", I believe? Another important consideration.
No, they are using it to mean “backed up”. Like, “if this data gets deleted or is in any way lost locally, it’s still backed remotely (even years later, when finally needed)”.
I’m astonished so many people here don’t know what a backup is! No wonder it’s easy for Backblaze to play them for fools.
> a copy of information held on a computer that is stored separately from the computer
there is nothing about _any_ versioning, or duration requirements or similar
To use your own words, I fear its you who doesn't know what a backup is and assume a lot other additional (often preferable(1)) things are part of that term.
Which is a common problem, not just for the term backup.
There is a reason lawyers define technical terms in a for this contract specific precise way when making contracts.
Or just requirements engineering. Failing there and you might end up having a backup of all your companies important data in a way susceptible to encrypting your files ransomware or similar.
---
(1): What often is preferable is also sometimes the think you really don't want. Like sometimes keeping data around too long is outright illegal. Sometimes that also applies to older versions only. And sometimes just some short term backups are more then enough for you use case. The point here is the term backup can't mean what you are imply it does because a lot of existing use cases are incompatible with it.
(as a side note, it's funny to see see them promoting their native C app instead of using Java as a "shortcut". What I wouldn't give for more Java apps nowadays)
Complete lack of communication (outside of release notes, which nobody really reads, as the article too states) is incompetence and indeed worrying.
Just show a red status bar that says "these folders will not be backed up anymore", why not?
So my idea is that it's a competency problem (lack of communication), not malice. But it's just a theory, based on my own experience.
In any case, this is a bad situation, however you look at it.
It almost seems like they’re taking it personally as some kind of intentionally slight against them.
Most users would not want Backblaze to back up other cloud synced directories. This default is sensible.
If you've got huge amounts of files in Onedrive and the backup client starts downloading everyone of them (before it can reupload them again) you're going to run into problems.
But ideally, they'd give you a choice.
Preferably cheap and rclone compatible.
Hetzner storagebox sounds good, what about S3 or Glacier-like options?
I assume when asking such a question, you expect an honest answer like mine:
rclone is my favorite alternative. Supports encryption seamlessly, and loaded with features. Plus I can control exactly what gets synced/backed up, when it happens, and I pay for what I use (no unsustainable "unlimited" storage that always comes with annoying restrictions). There's never any surprises (which I experienced with nearly every backup solution). I use Backblaze B2 as the backend. I pay like $50 a month (which I know sounds high), but I have many terabytes of data up there that matters to me (it's a decade or more of my life and work, including long videos of holidays like Christmas with my kids throughout the years).
For super-important stuff I keep a tertiary backup on Glacier. I also have a full copy on an external harddrive, though those drives are not very reliable so I don't consider it part of the backup strategy, more a convenience for restoring large files quickly.
Trying to audit—let alone change—the finer details is a pain even for power users, and there's a non-zero risk the GUI is simply lying to everybody while undocumented rules override what you specified.
When I finally switched my default boot to Linux, I found many of those offerings didn't support it, so I wrote some systemd services around Restic + Backblaze B2. It's been a real breath of fresh air: I can tell what's going on, I can set my own snapshot retention rules, and it's an order of magnitude cheaper. [2]
____
[1] Along the lines of "We have your My Documents. Oh, you didn't manually add My Videos or My Music for every user? Too bad." Or in some cases, certain big-file extensions are on the ignore list by default for no discernible reason.
[2] Currently a dollar or two a month for ~200gb. It doesn't change very much, and data verification jobs redownload the total amount once a month. I don't backn up anything I could get from elsewhere, like Steam games. Family videos are in the care of different relatives, but I'm looking into changing that.
As for GUIs in general... Well, like I said, I just finished several years of bad experiences with some proprietary ones, and I wanted to see and choose what was really going on.
At this point, I don't think I'd ever want a GUI beyond a basic status-reporting widget. It's not like I need to regularly micromanage the folder-set, especially when nobody else is going to tweak it by surprise.
_____
[1] The downside to the dumb-store is a ransomware scenario, where the malware is smart enough to go delete my old snapshots using the same connection/credentials. Enforcing retention policies on the server side necessarily needs a smarter server. B2 might actually have something useful there, but I haven't dug into it.
I know the post is talking about their personal backup product but it's the same company and so if they sneak in a reduction of service like this, as others have already commented, it erodes difficult-to-earn trust.
On macOS.
Edit: on top of that I've built a custom one-page monitoring dashboard, so I see everything in one place (https://imgur.com/B3hppIW) - I'll opensource, it's decent architecture, I just need to cleanup some secrets from Git history...
For stuff I care about (mostly photos), I back them up on two different services. I don't have TBs of those, so it's not very expensive. My personal code I store on git repositories anyway (like SourceHut or Codeberg or sometimes GitHub).
backup to real s3 storage.
llms on real api tokens.
search on real search api no adverts.
google account on workspace and gcp, no selling the data.
etc.
only way to stop corpos treating you like a doormat
That's a good warning
> Backblaze had let me down. Secondly within the Backblaze preferences I could find no way to re-enable this.
This - the nail in the coffin
"The Backup Client now excludes popular cloud storage providers [...] this change aligns with Backblaze’s policy to back up only local and directly connected storage."
I guess windows 10 and 11 users aren't backing up much to Backblaze, since microsoft is tricking so many into moving all of their data to onedrive.
I hope Backblaze responds to this with a "we're sorry and we've fixed this."
[1] https://www.backblaze.com/cloud-backup/personal
Don't even know why people rely on these guis which can show their magic anytime
* If your value your privacy, you need to encrypt the files on the client before uploading.
* You need to keep multiple revisions of each file, and manage their lifecycle. Unless you're fine with losing any data that was overwritten at the time of the most recent backup.
* You need to de-duplicate files, unless you want bloat whenever you rename a file or folder.
* Plus you need to pay for Amazon's extortionate egress prices if you actually need to restore your data.
I certainly wouldn't want to handle all that on my own in a script. What can make sense is using open source backup software with S3/R2/B2 as backing storage.
Most people (my mom) don't know what s3 and r2 is or how to use it.
I like how you can set multiple keys (much like LUKS) so that the key used by scheduled backups can be changed without messing with the key that I have memorized to restore with when disaster strikes.
It also means you can have multiple computers backing up (sequentially, not simultaneously) to the same repository, each with their own key.
also, you pay per-GB. the author is on backblaze's unlimited plan.
We're also seeing this play out in real time with Anthropic with their poop-splatter-llm. They've gone through like 4 rug-pulls, and people STILL pay $200/month for it. Every round, their unlimited gets worse and worse, like I outlined above.
Pay as you go is probably the more fair. But SaaS providers reallllllly hate in providing direct and easy to use tools to identify costs, or <gasp> limit the costs. A storag/backup provider could easily show this. LLM providers could show a near-realtime token utilization.
But no. Dark patterns, rug-pulls, and "i am altering the deal, pray i do not alter it further".
If they start excluding random content (eg: .git) without effective notice, maybe they AREN'T backing up everything you think they are.
My naive idea: Download 100 TB every 3 month to a 2nd device, create a list of files restored, validate checksums with the original machine, make a list of files differing and missing, check which ones are supposed to be missing? That sounds like a full time job.
I get that changing economics make it more difficult to honor the original "Backup Everything" promise but this feels very underhanded. I'll be cancelling.
The configuration and logging formats they use are absolutely nonsensical.
I’m only in my 40’s, I don’t require glasses (yet) and I have to actively squint to read your site on mobile. Safari, iPhone.
I’m pretty sure you’re under the permitted contrast levels under WCAG.
Is this maybe a pixel density of iphone issue?
I wouldn't mind a darker and higher weight font though.
- ctrl-shift-. to show hidden files on macOS - pull down to see search box (iOS 18) - swipe from top right corner for flashlight button - swipe up from lower middle for home screen
Etc, etc
I have a gesture for whoever decided "find in page" should go under share.
You can also just type your search term into the normal address bar and there's an item at the bottom of the list for "on this page - find <search>". I'd never even seen the find-in-page button under share.
Long pressing is much more pleasant.
I wish Apple would give us a hint rather than requiring us to chance upon this recommendation on HN.
Use this command in the developer tools console to change the color.
For desktop browsers, I also have a bookmarklet on the bookmarks bar with the following Javascript:
It doesn't darken the text on every webpage but it does work on this thread's article. (The Javascript code can probably be enhanced with more HTML heuristics to work on more webpages.)One day try throwing a pair on you'll be surprised. The small thin font is causing this not the text contrast. This and low light scenarios are the first things to go.
Whatever causes it, I do wear glasses (and on a recent prescription too) and the text is still very hard to read.
Firefox users: press F9 or C-A-R
What is it supposed to do?
There is no mention of F9 on this support page either:
https://support.mozilla.org/en-US/kb/keyboard-shortcuts-perf...
Am I missing something?
F9 on Windows
Ctrl + Alt + R on Linux
Command + Option + R on macOS
(It uses JS to only show the one for your platform but with view source you can see it mentions all three of these different OSes.)
So I guess the first guy is a Windows user and you other two use Linux.
There is a dropdown at the top-right to select the platform - no need to view source.
As for mentioning WCAG - so what if it doesn’t adhere to those guidelines? It’s his personal website, he can do what he wants with it. Telling him you found it difficult to read properly is one thing but referencing WCAG as if this guy is bound somehow to modify his own aesthetic preference for generic accessibility reasons is laughable. Part of what continues to make the web good is differing personal tastes and unique website designs - it is stifling and monotonous to see the same looking shit on every site and it isn’t like there aren’t tools (like reader mode) for people who dislike another’s personal taste.
Pinning this squarely on user error. Backblaze could clearly have done better, but it's such a well known failure mode that it's not much far off refusing to test restores of a bunch of tapes left in the sun for a decade.
It isn't user error if it was working perfectly fine until the provider made a silent change.
Unless the user error you are referring to is not managing their own backups, like I do. Though this isn't free from trouble, I once had silent failures backing up a small section of my stuff for a while because of an ownership/perms snafu and my script not sending the reports to stderr to anywhere I'd generally see them. Luckily an automated test (every now & then it scans for differences in the whole backup and current data) because it could see the source and noticed a copy wasn't in the latest snapshot on the far-away copy. Reliable backups is a harder problem then most imagine.
Also consider e.g. ~/.cache/thumbnails. It's easy to understand as a cache, but if the thumbnails were of photos on an SD card that gets lost or immediately dies, is it still a cache? It might be the only copy of some once-in-a-lifetime event or holiday where the card didn't make it back with you. Something like this actually happened to me, but in that case, the "cache" was a tarball of an old photo gallery generated from the originals that ought to have been deleted.
It's just really hard to know upfront whether something is actually important or not. Same for the Downloads folder. Vendor goes bankrupt, removes old software versions, etc. The only safe thing you can really do is hold your nose and save the whole lot.