403 errors when downloading instructions #123

Open
opened 2025-12-23 02:26:32 +01:00 by bricklemur · 8 comments

Hi,

I'm not able to download instructions from Rebrickable. I've tried with several sets and always get "Error: No instructions found on Rebrickable or Peeron." Adding sets and parts works as expected.

It looks like I get a 403 in the logs, while manually opening the URL in a browser does show results:

https://rebrickable.com/instructions/40751-1/

Logs:

DEBUG:bricktracker.instructions:[find_instructions] fetching HTML from 'https://rebrickable.com/instructions/40751-1/'
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): rebrickable.com:443
DEBUG:urllib3.connectionpool:https://rebrickable.com:443 "GET /instructions/40751-1/ HTTP/1.1" 403 None
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): peeron.com:80
DEBUG:urllib3.connectionpool:http://peeron.com:80 "GET /scans/40751-1 HTTP/1.1" 302 None
DEBUG:urllib3.connectionpool:http://peeron.com:80 "GET /cgi-bin/invcgis/scans/40751-1?ct=1 HTTP/1.1" 200 2563

I've checked for similar issues and saw that one was solved several months ago, but no comments on how.

I'm currently on v1.3.1

This is my compose:

bricktracker:
    profiles: ["all"]
    image: gitea.baerentsen.space/frederikbaerentsen/bricktracker:1.3.1
    container_name: bricktracker
    restart: unless-stopped
    user: ${PUID}:${PGID}
    tty: false
    stdin_open: false
    # read_only: true 
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    tmpfs: 
      - /tmp:uid=${PUID},gid=${PGID},rw,noexec,nosuid,nodev,size=512m
    mem_limit: 1g
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "5"
    networks:
      - frontend_apps
    volumes:
      - ./data/bricktracker:/app/data/
    environment:
      - BK_REBRICKABLE_API_KEY=${REBRICKABLE_API_KEY}
      - BK_DEBUG=true

Not sure what version of the container you are on, but I had this problem for about a month until I got on the latest version. I had to go through the upgrade steps to do it, but once I got up to date the problem went away.


That’s honestly super weird. Your logs suggest you have the error that was happening on 1.2.4, but if you are sure your version is 1.3.1 then I don’t know what the issue is. Do you have “BrickTracker (1.3.1)” in the bottom left corner of the front page?

I don’t have the issue on my own instance, and I set up a new instance using `latest` and I don’t see the issue there either.

FrederikBaerentsen added the Kind/Bug label 2025-12-23 16:25:13 +01:00
Author

Thanks for the quick reply :) It does show 1.3.1 in the bottom left corner of the page.

I tried using latest and also tried removing all the additional hardening so it pretty much matched your compose example, but still no luck.

I also tried spinning up a fresh container with new volumes and got the same issue.

It looks like Cloudflare may be blocking the container. I couldn't test curl or wget inside the container, but with Python I get the following:

$ python - <<'EOF'
import requests
r = requests.get(
    "https://rebrickable.com/instructions/40619-1/",
    headers={"User-Agent": "Mozilla/5.0"}
)
print("status:", r.status_code)
print("headers:", r.headers)
print("body snippet:", r.text[:300])
EOF
status: 403
headers: {'Date': 'Tue, 23 Dec 2025 23:55:01 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'close', 'accept-ch': 'Sec-CH-UA-Bitness, Sec-CH-UA-Arch, Sec-CH-UA-Full-Version, Sec-CH-UA-Mobile, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform, Sec-CH-UA, UA-Bitness, UA-Arch, UA-Full-Version, UA-Mobile, UA-Model, UA-Platform-Version, UA-Platform, UA', 'cf-mitigated': 'challenge', 'critical-ch': 'Sec-CH-UA-Bitness, Sec-CH-UA-Arch, Sec-CH-UA-Full-Version, Sec-CH-UA-Mobile, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform, Sec-CH-UA, UA-Bitness, UA-Arch, UA-Full-Version, UA-Mobile, UA-Model, UA-Platform-Version, UA-Platform, UA', 'cross-origin-embedder-policy': 'require-corp', 'cross-origin-opener-policy': 'same-origin', 'cross-origin-resource-policy': 'same-origin', 'origin-agent-cluster': '?1', 'permissions-policy': 'accelerometer=(),browsing-topics=(),camera=(),clipboard-read=(),clipboard-write=(),geolocation=(),gyroscope=(),hid=(),interest-cohort=(),magnetometer=(),microphone=(),payment=(),publickey-credentials-get=(),screen-wake-lock=(),serial=(),sync-xhr=(),usb=()', 'referrer-policy': 'same-origin', 'server-timing': 'chlray;desc="9b2bc997ab6eee21"', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'Cache-Control': 'private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Expires': 'Thu, 01 Jan 1970 00:00:01 GMT', 'Report-To': '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v4?s=IgEW%2B4e1ZWtkJlSpIQAgjI2g3hu%2Bxua%2BELBNAlkDOoG44jfmMZaHclXnEhnYtQiWVI9jxJs1r38hBy87CPXZZbEGqfRrUfjsxy%2BKxUiBzyxsbKWc991ohZXPMYCFijnpfw%3D%3D"}],"group":"cf-nel","max_age":604800}', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'Vary': 'Accept-Encoding', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains', 'Server': 'cloudflare', 'CF-RAY': 
'9b2bc997ab6eee21-BKK', 'Content-Encoding': 'gzip'}
body snippet: <!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta http-equiv="X-UA-Compatible" content="IE=Edge"><meta name="robots" content="noindex,nofollow"><meta name="viewport" content="width=device-width,initial-scal

When I try curl from my NAS, I get an HTTP 200 response.
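
For anyone repeating this check, the response above can be classified without reading the whole header dump. The following is only a minimal sketch: the helper name `looks_like_cloudflare_challenge` is hypothetical and not part of BrickTracker, and the heuristic relies solely on the signals visible in the output above (status 403, `Server: cloudflare`, `cf-mitigated: challenge`, and the "Just a moment..." page title).

```python
def looks_like_cloudflare_challenge(status_code, headers, body=""):
    """Heuristic check for a Cloudflare challenge page (hypothetical helper)."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {str(k).lower(): str(v).lower() for k, v in headers.items()}
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    mitigated = lowered.get("cf-mitigated") == "challenge"
    # The interstitial page seen above carries this title.
    interstitial = "<title>just a moment...</title>" in body.lower()
    return status_code == 403 and served_by_cloudflare and (mitigated or interstitial)


# Values copied from the response shown above.
print(looks_like_cloudflare_challenge(
    403,
    {"Server": "cloudflare", "cf-mitigated": "challenge"},
    '<!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title>',
))  # prints True
```

With a `requests` response `r`, this could be called as `looks_like_cloudflare_challenge(r.status_code, r.headers, r.text)`.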


The BrickTracker code actually uses a `requests` session with browser-like headers and not just a plain `requests.get`. This way we don't run into Cloudflare issues. BrickTracker previously used cloudscraper, but that hasn't been updated in some time and would get caught by Cloudflare.

Your code example also returns 403 for me.

A simplified version of the BrickTracker code is:

#  cf_test.py
import requests
session = requests.Session()
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Cache-Control': 'max-age=0'
})

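# Set to look up (same set number as in the failing example)
set_number = '40619-1'

instructions_page = f"https://rebrickable.com/instructions/{set_number}/"

# Fetch the instructions page with the browser-like headers set above
response = session.get(instructions_page, stream=True, allow_redirects=True)

# Send the instructions page as Referer on any follow-up requests made with this session
session.headers.update({"Referer": instructions_page})

# Print the status code and the full HTML so the output can be piped to grep
print(f'Status Code: {response.status_code}, Content: {response.text}')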

If you run this using `python cf_test.py | grep /download/?expire`, you should see a list of the direct download links for the instructions, like:

❯ python cf_test.py | grep /download/?expire
      <a href="/instructions/231933/4523d52e3b72580f6c29d60a018f6336960003f7497f15c4a6dd87daac976d13/download/?expire=1766591800" target="_blank">
      <a href="/instructions/231934/7ac8e911733f4a58f0d3cbb9f843c493244813a09e3e80d0d1168eac461878dd/download/?expire=1766591800" target="_blank">
      <a href="/instructions/231935/eacaec207beaac6063b74fa21f128b32ff6481b9e4c8f7c958a2b6e178156dc1/download/?expire=1766591800" target="_blank">
      <a href="/instructions/231936/8953bc99f7984185f37c8867233de317ec24e3c87f496ea37b51f705c119f841/download/?expire=1766591800" target="_blank">
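
The same filtering can be done in Python instead of piping to grep. This is a sketch under assumptions, not the BrickTracker implementation: `DownloadLinkParser` and `extract_download_links` are hypothetical names, and the only thing taken from the output above is that download links are `<a>` tags whose `href` contains `/download/?expire=`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DownloadLinkParser(HTMLParser):
    """Collects hrefs of <a> tags that look like instruction download links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Same pattern the grep command above filters on.
        if "/download/?expire=" in href:
            # The hrefs are site-relative, so resolve them against the base URL.
            self.links.append(urljoin(self.base_url, href))


def extract_download_links(html, base_url="https://rebrickable.com"):
    """Return absolute download URLs found in the given HTML (hypothetical helper)."""
    parser = DownloadLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Calling `extract_download_links(response.text)` on the response from `cf_test.py` should give the same links as the grep output. An empty list would suggest the response was not the real instructions page.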


I am having the same problem; the output of the sample cf_test.py code does properly show instruction sets, but any attempt to download within 1.3.1 gives an error of no instructions found.

FrederikBaerentsen added the Priority/Low label 2026-01-01 17:02:28 +01:00
Author

Hey @FrederikBaerentsen, sorry for the late follow-up. I've tried the cf_test.py you shared and the output was blank, so it looks like it isn't working either.

Any other suggestions?

Author

Update: I tried putting it behind a gluetun network, and instruction downloads work now.


> Update: I tried putting it behind a gluetun network, and instruction downloads work now.

That sounds like an issue with your IP, if using a gluetun VPN (which I am assuming gives you a new IP) works.


Reference: FrederikBaerentsen/BrickTracker#123