Adventures in Self-Hosting: Brave-Go-Sync

Ever since I started down this route with Glanceapp & Authentication to solve a "simple efficiency problem," the bug of wanting to move more and more services directly under my control hasn't quite stopped. If anything, it's rekindled the want to maintain tech skills that I had left behind over the years (among other excuses that will certainly pop up).

As an aside, I'll be weaving in a series of "Adventures in Self-Hosting" detailing thought processes of why you might want to self-host a service, always with a sprinkle of product thinking. If you want things to "just work," this isn't necessarily a path to go down, but if you're wanting to get under the hood and learn a few things to help build some bridges with engineering, it's a decent exercise.

The Why

Browsers are a thing we all use, yet most people don't give a second thought to which one they're using to browse the web. It's a central part of most users' internet experience, though. And for those of us who have been around the block a few times (Netscape, anyone?), we all know what a compatibility pain it was just a handful of years ago. I think we could all agree that most people will just use whatever browser is pre-installed on their computer and call it a day.

Which leads us to where this now very basic tool stands today. Standards are slanted heavily towards one engine, Chromium. (For completeness, yes, there is Firefox with Gecko, and Safari, if you're on a Mac, and an army of smaller projects out there.) Any Product Manager worth their salt is going to say that your behavior while using their tool is invaluable: for bug fixing, for planning features, for understanding the user base, and for monetization. While compatibility and new features for web development are still a concern, they're also mixed in with other motivations.

We use our browsers for some really personal stuff, and while not everything is tracked, plenty of it is collected, (potentially) anonymized, and analyzed to serve you more ~~lucrative~~ interesting features. Lots of these browsers also offer an option to synchronize settings, bookmarks, history, and other browser-based data. Even in an anonymized form, this is all still a treasure trove of data.

In that regard, I opted for Brave which, at least superficially, provides some modicum of privacy (though not without some level of controversy in its past). Since Brave, like most of the Chromium-based browsers, is open source, I noticed that they provide a stable version (for some values of stable) of the source code for their sync service. And, out of a mixture of a dash of curiosity and a heaping spoonful of "why not?", I decided to experiment with it and self-host it.

We could break this down using the Jobs To Be Done (JTBD) framework, if we were so inclined. A breakdown might look like this:

Main Jobs To Be Done - Brave Go Sync

Functional Aspects

I want to be able to synchronize my Brave browser data to a private server so that all my devices can have access to the data and provide a seamless experience so I can pick up exactly where I left off on any device.

Emotional Aspects

I want to be able to synchronize my Brave browser data to a private server so that I feel that my private browser data is really private and safeguarded.

Personal Dimension

By feeling that my browser data is secure, I know that my personal information is safe(r).

Social Dimension

By self-hosting my private browser data, I am reflecting how privacy-conscious a user I am.

This is just a quick, generic breakdown of why someone might want to "hire" Brave-Go-Sync instead of using, let's say, Brave's own built-in sync service, or Chrome's, or Edge's, or anyone else's. I could have just as easily done a breakdown where I am "hiring" Brave-Go-Sync's self-hosting service because I want to write a post about it, and how that feeds into my challenge-solving emotional needs, etc.

The point of something like the JTBD framework is to help visualize what the user wants to accomplish, their motivations (among, potentially, other things), and what success looks like for that user. You may end up with several "jobs" a user might have. You may also end up with several sufficiently distinct motivations and dimensions for a single "job." Those are all now testable against the market, giving you different personas or marketing messages to make sure you hit your target market.

The How

Now, for the juicy part. If you took a look at the repo, it's not exactly in the friendliest of states for self-hosting. The code, as it stands, starts up a DynamoDB Local instance that stores everything in memory, which is not ideal for most people. Your server goes down, you update Docker, or you restart because of maintenance, and poof goes your sync data. No one wants that (unless you're really into ephemeral data sync for some reason).

ippocratis did a pretty decent write-up on how he deployed Brave's Go Sync service for his environment. It was the direct inspiration for this post (along with me restarting all the containers without realizing what that meant, and losing the sync chain), but adjusted here for a standard x64/x86 Docker environment.

First things first: go to an appropriate directory on your server, something like ~/git-projects, and run:

git clone https://github.com/brave/go-sync

Go into the newly cloned directory. You're going to want to edit 3 files:

Edit or replace the dynamo.Dockerfile
Add the dynamo-entrypoint.sh
Edit or replace the docker-compose.yml

dynamo.Dockerfile

I went the same route as ippocratis with Amazon Corretto as the base image. You could use a newer version, but this one is known to work. Like many users who have tried to give Brave Go Sync persistent data, I initially ran into a 500 error when trying to write to the database. What I found was that the schema was not being created in the database on build. I added an entrypoint script to make sure the table gets created correctly, but more on that later.

Update or replace the dynamo.Dockerfile with the following:

FROM amazoncorretto:11

WORKDIR /app

# Install dependencies
USER root
RUN yum install -y \
    shadow-utils \
    curl \
    unzip \
    python3 \
    python3-pip \
    tar \
    gzip && \
    yum clean all

# Download and extract DynamoDB Local
RUN curl -sL https://s3.us-west-2.amazonaws.com/dynamodb-local/dynamodb_local_latest.tar.gz -o dynamodb.tar.gz && \
    mkdir /app/dynamodb && \
    tar -xzf dynamodb.tar.gz -C /app/dynamodb && \
    rm dynamodb.tar.gz

# Move DynamoDBLocal.jar and lib folder to /app
RUN mv /app/dynamodb/DynamoDBLocal.jar /app/ && \
    mv /app/dynamodb/DynamoDBLocal_lib /app/ && \
    rm -rf /app/dynamodb

# Install AWS CLI
RUN curl -sL "https://awscli.amazonaws.com/awscli-exe-linux-$(uname -m).zip" -o "awscliv2.zip" && \
    unzip -q awscliv2.zip && \
    ./aws/install && \
    rm -rf awscliv2.zip aws

# Environment
ENV AWS_ACCESS_KEY_ID=GOSYNC
ENV AWS_SECRET_ACCESS_KEY=GOSYNC
ENV AWS_ENDPOINT=http://localhost:8000
ENV AWS_REGION=us-west-2
ENV TABLE_NAME=client-entity-dev
ENV PATH="/usr/local/bin:$PATH"

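# Copy schema and the entrypoint script (named as in the section below)
COPY schema/dynamodb/ /app
COPY dynamo-entrypoint.sh /app
RUN chmod +x /app/dynamo-entrypoint.sh
RUN mkdir -p /app/db

# Optional healthcheck. A plain `curl -f` against DynamoDB Local's root URL
# gets an HTTP 400 (unsigned request) and would report the container as
# unhealthy, so ask the service to list its tables instead.
HEALTHCHECK --interval=5s --timeout=5s --retries=3 \
  CMD aws dynamodb list-tables --endpoint-url "$AWS_ENDPOINT" --region "$AWS_REGION" --no-cli-pager > /dev/null || exit 1

# Start DynamoDB Local through the entrypoint script, which also creates the table
CMD ["/app/dynamo-entrypoint.sh"]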

dynamo-entrypoint.sh

The dynamo.Dockerfile originally had the schema creation built in, but I moved it out into this shell script for extra comfort. I am sure that ippocratis' Dockerfile was fine, but checking that the table exists before trying to create it is a cheap precaution.

#!/bin/sh
set -e

# Start DynamoDB Local in background
java -Djava.library.path=./DynamoDBLocal_lib \
     -cp DynamoDBLocal.jar:./DynamoDBLocal_lib/* \
     com.amazonaws.services.dynamodbv2.local.main.ServerRunner \
     -sharedDb -dbPath /app/db &

DYNAMO_PID=$!

# Wait for DynamoDB Local to start up
echo "Waiting for DynamoDB Local to start..."
sleep 5

# Create table if it doesn't exist (variables quoted so odd values can't split into extra arguments)
TABLES=$(aws dynamodb list-tables --endpoint-url "$AWS_ENDPOINT" --region "$AWS_REGION" --no-cli-pager | grep -- "$TABLE_NAME" || true)
if [ -z "$TABLES" ]; then
    echo "Creating table..."
    aws dynamodb create-table --cli-input-json file:///app/table.json \
      --endpoint-url "$AWS_ENDPOINT" --region "$AWS_REGION"
    echo "Enabling TTL..."
    aws dynamodb update-time-to-live --table-name "$TABLE_NAME" \
      --time-to-live-specification "Enabled=true, AttributeName=ExpirationTime" \
      --endpoint-url "$AWS_ENDPOINT" --region "$AWS_REGION"
else
    echo "Table $TABLE_NAME already exists, skipping creation."
fi

# Wait and keep DynamoDB Local running
wait $DYNAMO_PID
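
The fixed five-second sleep in the script above is a guess at how long DynamoDB Local needs to come up. On a slow or busy host it may not be enough. A minimal sketch of an alternative is to poll until the service answers. The `wait_for` helper below is hypothetical (it is not part of the go-sync repo or of ippocratis' setup), and the 30-attempt limit is an arbitrary assumption:

```shell
#!/bin/sh
# wait_for MAX_TRIES CMD [ARGS...]
# Hypothetical helper: runs CMD about once per second until it succeeds,
# giving up (non-zero exit status) after MAX_TRIES failed attempts.
wait_for() {
    max_tries="$1"
    shift
    tries=0
    until "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Possible use inside the entrypoint, in place of `sleep 5`:
#   wait_for 30 aws dynamodb list-tables \
#       --endpoint-url "$AWS_ENDPOINT" --region "$AWS_REGION" --no-cli-pager \
#     || { echo "DynamoDB Local did not start in time" >&2; exit 1; }
```

Because the call sits on the left side of `||`, a timeout does not trip `set -e` before the error message is printed.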

The Build

With those 2 files in your git project, build the two images so you're not trying to rebuild them in the docker compose file. That's going to look like this:

docker build -t go-sync-dynamo-local:latest -f dynamo.Dockerfile .

And...

docker build -t go-sync-app:latest -f Dockerfile .

You should now have 2 additional images available for Docker to use. You can check by running:

docker images "go-sync-*"

The Docker Compose File

Great! Now you have the images pre-built, and all that's missing is a docker-compose.yml to get the service up and running. Since we want data persistence for the database, we'll need to create a directory or a Docker volume. I used a directory, like ~/brave-go-sync/dynamo-data, but either way works. Once that's created, use your favorite editor (and why is it not vim?) to create a file called docker-compose.yml (it could also live in ~/brave-go-sync) so it looks like this:

services:
  web:
    image: go-sync-app:latest # using the prebuilt image from ~/git-projects
    container_name: brave-go-sync
    restart: always
    # if you want to limit the memory this container can use
    deploy:
      resources:
        limits:
          memory: 512M
    ports:
      - "8295:8295"
    depends_on:
      - dynamo-local
      - redis
    environment:
      - PPROF_ENABLED=true
      - SENTRY_DSN
      - ENV=local
      - DEBUG=1
      - AWS_ACCESS_KEY_ID=GOSYNC
      - AWS_SECRET_ACCESS_KEY=GOSYNC
      - AWS_REGION=us-west-2
      - AWS_ENDPOINT=http://brave-go-dynamo:8000
      - TABLE_NAME=client-entity-dev
      - REDIS_URL=brave-go-redis:6379

  dynamo-local:
    image: go-sync-dynamo-local:latest # using the prebuilt image from ~/git-projects
    container_name: brave-go-dynamo
    restart: always
    # if you want to limit the memory this container can use
    deploy:
      resources:
        limits:
          memory: 512M
    ports:
      - "8000:8000"
    volumes:
      - /path-to/brave-go-sync/dynamo-data:/app/db
    environment:
      - PPROF_ENABLED=true
      - SENTRY_DSN
      - ENV=local
      - DEBUG=1
      - AWS_ACCESS_KEY_ID=GOSYNC
      - AWS_SECRET_ACCESS_KEY=GOSYNC
      - AWS_REGION=us-west-2

  redis:
    image: redis:6.2
    container_name: brave-go-redis
    restart: always
    deploy:
      resources:
        limits:
          memory: 128M
    ports:
      - "6379:6379"
    environment:
      - ALLOW_EMPTY_PASSWORD=yes

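This publishes all three ports on the host, which assumes nothing else is already using them (but let's face it, if you're reading about how to host your own sync service for Brave, something probably is). If there's a conflict, or you simply don't need a port reachable from outside Docker, comment out the relevant bits like so (DynamoDB and Redis are only used by the sync container, so their ports are good candidates):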

# ports:
#   - "8000:8000"

You'll then point your reverse proxy (nginx, nginx proxy manager, caddy, traefik, etc.) at the sync container, following its instructions on how to map a host to a Docker container by container name (the proxy needs to share a Docker network with the container for the name to resolve). Make sure you have a certificate for that host and you're almost done.
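
The stack also needs to actually be started at some point. A minimal sketch, assuming docker-compose.yml sits in ~/brave-go-sync and the data directory is the dynamo-data folder next to it (the `start_sync_stack` function name is made up for illustration, and the data directory has to match whatever host path your volumes entry points at):

```shell
#!/bin/sh
# Hypothetical helper: create the data directory, then start the stack
# from the directory that holds docker-compose.yml (passed as $1).
start_sync_stack() {
    project_dir="$1"
    mkdir -p "$project_dir/dynamo-data" || return 1
    # Run in a subshell so the caller's working directory is left alone.
    (
        cd "$project_dir" || exit 1
        docker compose up -d || exit 1
        docker compose ps
    )
}

# Possible use:
#   start_sync_stack "$HOME/brave-go-sync"
#   docker logs brave-go-dynamo   # on first start, look for "Creating table..."
```

On later restarts the log should show the "already exists, skipping creation" message instead, which is a quick way to confirm the data survived.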

As of this writing, Brave doesn't support custom sync servers (unless you like to live dangerously and are running Brave Nightly), but there is a workaround. If you're on Windows, open your shortcut and for the Target, use something like:

"C:\Program Files\BraveSoftware\Brave-Browser\Application\brave.exe" --sync-url="https://sync.yourdomain.com/v2"

If you want to modify your Start Menu shortcut, it lives in ProgramData > Microsoft > Windows > Start Menu > Programs, usually.

Start up Brave and go to brave://sync-internals. If you see your private server next to Server URL under Version info as "https://sync.yourdomain.com/v2"... GRATZ! You're all set to start your privately hosted sync chain. If not, read on!

When I started on this adventure, there was a bug in Brave that stripped the "/v2" out of the Server URL. You don't need me to tell you That's Bad(tm). But there's (yet another) workaround. Using nginx, you'll have to add an additional location for that virtual host. It's going to look somewhat like this:

location / {
    # Brave may drop the "/v2" prefix: rewrite /command... to /v2/command...
    # Requests that already start with /v2/command don't match and pass through unchanged.
    rewrite ^/command(/.*)?$ /v2/command$1 break;

    # No URI part on proxy_pass, so the (possibly rewritten) URI is forwarded as-is
    proxy_pass http://brave-go-sync:8295;
}

Restart that host and now (hopefully) you're golden!
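
If you want to convince yourself what the rewrite does before touching a live proxy, the same kind of substitution can be tried offline. This is only a sketch: `rewrite_path` is a hypothetical helper that uses sed to mimic the pattern, not anything nginx itself provides, and it looks at the path only (no query string).

```shell
#!/bin/sh
# Hypothetical helper: mimic the "/command -> /v2/command" rewrite with sed.
rewrite_path() {
    printf '%s\n' "$1" | sed -E 's#^/command(/.*)?$#/v2/command\1#'
}

rewrite_path "/command/"      # prints /v2/command/
rewrite_path "/v2/command/"   # prints /v2/command/ (already correct, left alone)
```

Both forms end up at the same upstream path, which is all the workaround needs to achieve.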

See? Didn't I say it was an adventure?