Discourse: Pitfalls When Bulk-Creating VM Accounts and API Keys

Background

We need to provision each of the 8 VMs (fedora-devops, kali, elementary, studio, fedora-ai, translate, debian, modiqi) on wpcommunity.com with a dedicated forum-cli account and a Single User API key.

What seemed like a straightforward task turned out to involve several pitfalls. Here is a record of them.

Pitfall 1: MCP Discourse Tool Returns 404 for Some Users

Using the Discourse MCP tool’s get_user command to query the 8 usernames returned a successful response only for modiqi; all others returned 404.

However, running User.where(username: ...) via Rails runner confirmed that 7 of the 8 users actually exist in the database (only kali was missing).

Root Cause: These bot/system users likely have non-public profiles, or the MCP tool uses public API endpoints instead of admin-only ones.

Lesson Learned: Never fully trust MCP tool 404 responses for bot/system users. Always double-check existence using Rails runner or admin API endpoints.

Pitfall 2: Single User API Keys Lack Admin Privileges

The API key for wenpai-dev has Single User scope and admin: false. It cannot be used to create users or generate new API keys via the API.

Even attempting to use the system username with the same key fails outright with invalid_access.

Conclusion: Admin-level operations—such as user creation and API key management—must be performed exclusively via SSH → Docker → Rails runner. Given their infrequency, distributing a global admin API key is not justified.

Pitfall 3: Escaping Hell with SSH + Docker + Rails Runner

Executing Ruby code via ssh prod-b "docker exec app su discourse -c '...'" involves three layers of shell quoting—and escaping quickly becomes error-prone.

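This is especially true when mixing Ruby’s save!, string interpolation (#{}), and heredocs: inside double quotes, an interactive bash may treat ! as history expansion, and writing \! to suppress it leaves the backslash in the string, so Ruby receives save\! and raises a syntax error.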

Solution: Avoid inline Ruby entirely. Instead:

# Write script locally
scp script.rb prod-b:/tmp/
ssh prod-b "docker cp /tmp/script.rb app:/tmp/ && \
  docker exec app chown discourse:discourse /tmp/script.rb && \
  docker exec app su discourse -c \
    'cd /var/www/discourse && RAILS_DB=default bundle exec rails runner /tmp/script.rb'"

This is currently the most robust approach: the Ruby code itself never passes through a shell, so it needs no quoting or escaping.
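To see why inlining is fragile, the sketch below nests a one-line Ruby snippet through the same three layers using Ruby's standard Shellwords module. The snippet and the variable names are illustrative only; the host (prod-b) and container (app) names are taken from the command above.

```ruby
require "shellwords"

# Illustrative one-liner containing the usual troublemakers: "!", double
# quotes, and "#{...}". Single quotes keep Ruby from interpolating it here.
ruby_code = 'user.save!; puts "id=#{user.id}"'

# Layer 1: parsed by the shell that `su -c` starts inside the container.
inner = "bundle exec rails runner #{Shellwords.escape(ruby_code)}"

# Layer 2: parsed by the remote shell that ssh starts on prod-b.
middle = "docker exec app su discourse -c #{Shellwords.escape(inner)}"

# Layer 3: parsed by the local shell.
outer = "ssh prod-b #{Shellwords.escape(middle)}"

# Every layer re-escapes the backslashes added by the previous one,
# so the count grows with each level of nesting.
[inner, middle, outer].each do |cmd|
  count = cmd.count("\\")
  puts "#{count} backslashes: #{cmd}"
end
```

Generated escaping like this is correct but unreadable, and hand-written escaping is rarely correct; copying a script file sidesteps both problems.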

Pitfall 4: User#approve No Longer Exists in Newer Discourse Versions

Calling user.approve(Discourse.system_user) after user creation raises undefined method 'approve'.

In newer Discourse versions (Rails 8 + Discourse 3.x), the approval logic has been refactored. Approval status should now be set directly during creation:

user = User.create!(
  username: "kali",
  email: "kali@example.com",  # placeholder address
  password: SecureRandom.hex(16),
  active: true,
  approved: true,
  trust_level: 0
)
user.activate

Pitfall 5: ApiKey#key Is Only Readable Immediately After Creation

This is the sneakiest one. Discourse’s ApiKey model enforces strict access control on the key field:

API key is only accessible immediately after creation (ApiKey::KeyAccessError)

That is, once an API key is created, its plaintext value cannot be retrieved again via Rails (the database stores only its hash).

Solution: To obtain the plaintext key, you must first destroy any existing keys for the user, then create a new one and read its .key attribute immediately:

ApiKey.where(user_id: user.id).destroy_all
api_key = ApiKey.new(user_id: user.id, description: "...", created_by_id: -1)
api_key.save!
raw_key = api_key.key  # This is your *only* chance to read the plaintext key
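The behaviour is easier to reason about with a toy model. The class below is not Discourse's implementation: ToyApiKey, reload_from_db!, and matches? are hypothetical names, and SHA-256 is assumed purely for illustration. It only shows why a record that persists nothing but a hash cannot hand the plaintext back later.

```ruby
require "securerandom"
require "digest"

# Toy stand-in for a model that stores only a hash of its secret.
class ToyApiKey
  class KeyAccessError < StandardError; end

  attr_reader :key_hash

  def initialize
    @key = SecureRandom.hex(32)                 # plaintext lives in memory only
    @key_hash = Digest::SHA256.hexdigest(@key)  # the only value that would be persisted
  end

  # Readable only while the freshly created object is still in memory.
  def key
    raise KeyAccessError, "API key is only accessible immediately after creation" if @key.nil?
    @key
  end

  # Simulates loading the record again later: the plaintext is gone.
  def reload_from_db!
    @key = nil
    self
  end

  # A presented key can still be verified by comparing hashes.
  def matches?(candidate)
    Digest::SHA256.hexdigest(candidate) == key_hash
  end
end

api_key = ToyApiKey.new
raw_key = api_key.key        # capture the plaintext now
api_key.reload_from_db!

puts api_key.matches?(raw_key)  # prints "true"
begin
  api_key.key
rescue ToyApiKey::KeyAccessError => e
  puts "KeyAccessError: #{e.message}"
end
```

The practical consequence is the same as in the snippet above: write the plaintext to its destination in the same script run that creates the key.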

Pitfall 6: Rails Runner Must Be Run as the discourse User

Running docker exec app rails runner ... as root results in PostgreSQL role permission errors. You must switch to the discourse user:

docker exec app su discourse -c 'cd /var/www/discourse && bundle exec rails runner ...'

In multi-site environments, also set RAILS_DB to the database that serves the target site (see Pitfall 7); RAILS_DB=default only addresses the default site.

Final Outcome

VM             User ID  Status
fedora-devops  34       Already exists
kali           44       Newly created
elementary     38       Already exists
studio         39       Already exists
fedora-ai      40       Already exists
translate      36       Already exists
debian         43       Already exists
modiqi         2        Already exists (admin)

All 8 Single User API keys have been successfully generated and stored at /mnt/shared-context/secrets/wpcommunity-vm-api-keys.json.

Pitfall 7 (Most Critical): Incorrect RAILS_DB Configuration for Multi-Site Setup

After working through all six pitfalls above, the API keys were distributed to each VM, but weixiaoduo received HTTP 403 responses; even GET /categories.json failed.

Troubleshooting steps:

  • User status was normal (active, approved, tl=2, not suspended).
  • The API key had not been revoked, and its scopes were empty (i.e., full permissions).
  • Regenerating a new Global API key still resulted in 403.
  • Using curl -v, we observed Discourse returning: Invalid API username or key.

Root cause identified: wpcommunity.com is configured as an independent database within Discourse’s multi-site setup.

# config/multisite.yml
wpcommunity:
  adapter: postgresql
  database: wpcommunity_discourse
  host_names:
    - wpcommunity.com
    - www.wpcommunity.com

# Ruby: list the databases this instance knows about
RailsMultisite::ConnectionManagement.all_dbs
# => ["default", "wpcommunity"]

All prior operations used RAILS_DB=default, meaning users and API keys were created exclusively in the default database (meta.cyberforums.com). However, requests to wpcommunity.com route to the wpcommunity database—where none of those users or keys exist—hence the inevitable 403.

Fix: Re-created all 7 missing VM users and 9 Global API keys using RAILS_DB=wpcommunity. Verification passed.

The user IDs listed in the “Final Outcome” table in the original post (e.g., fedora-devops=34, kali=44) belong to the default database; IDs in the wpcommunity database are entirely different.

Does forum-cli need modification?

No code changes required. forum-cli merely uses an API key and URL to call endpoints—it has no dependency on the underlying database. Both sites’ .conf files are correctly configured:

Configuration File           Site                  Database
.forum-cli.conf              meta.cyberforums.com  default
.forum-cli-wpcommunity.conf  wpcommunity.com       wpcommunity

The issue lies solely in the API key creation process, not in forum-cli.

Prevention Checklist: Avoiding This Pitfall Again

Before running any Rails runner command on this Discourse instance, always first execute:

puts RailsMultisite::ConnectionManagement.all_dbs

Then select the correct RAILS_DB value based on the target domain:

  • meta.cyberforums.com → RAILS_DB=default
  • wpcommunity.com → RAILS_DB=wpcommunity

If uncertain, consult the host_names mapping in config/multisite.yml.
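The lookup can also be done mechanically. The sketch below parses a YAML string shaped like the multisite.yml excerpt shown earlier and returns the entry whose host_names contains a given domain. The helper name rails_db_for is hypothetical, and the embedded YAML is a copy of the excerpt rather than the real file.

```ruby
require "yaml"

# Copy of the excerpt from config/multisite.yml (not read from disk here).
MULTISITE_YML = <<~YAML
  wpcommunity:
    adapter: postgresql
    database: wpcommunity_discourse
    host_names:
      - wpcommunity.com
      - www.wpcommunity.com
YAML

# Returns the multisite entry name whose host_names lists `host`,
# or nil when the host is not listed in multisite.yml.
def rails_db_for(host, multisite_config)
  match = multisite_config.find do |_name, settings|
    Array(settings["host_names"]).include?(host)
  end
  match ? match.first : nil
end

config = YAML.safe_load(MULTISITE_YML)

puts rails_db_for("wpcommunity.com", config)    # prints "wpcommunity"
p rails_db_for("meta.cyberforums.com", config)  # prints nil
```

A nil result means the host is not a multisite entry. In the setup described here that is the case for meta.cyberforums.com, which is served by the default database.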

Standard procedure when adding a new VM:

  1. Confirm the RAILS_DB corresponding to the target site.
  2. Create the user and API key using the correct RAILS_DB.
  3. Immediately verify via curl (GET /categories.json)—do not wait until the VM reports an error before investigating.
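Step 3 can be scripted so it runs right after key creation. The following is a minimal sketch using Ruby's standard Net::HTTP; build_check_request and key_works? are hypothetical helper names, and treating any 2xx response as success is an assumption. Api-Key and Api-Username are the request headers the Discourse API reads for key authentication.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) an authenticated GET /categories.json request.
def build_check_request(base_url, api_key:, api_username:)
  uri = URI.join(base_url, "/categories.json")
  request = Net::HTTP::Get.new(uri)
  request["Api-Key"] = api_key
  request["Api-Username"] = api_username
  [uri, request]
end

# Sends the request and reports whether the site accepted the key.
# Assumption: a 2xx status means the key is valid for this site.
def key_works?(base_url, api_key:, api_username:)
  uri, request = build_check_request(base_url, api_key: api_key, api_username: api_username)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  response.is_a?(Net::HTTPSuccess)
end

# Usage (performs a real HTTP request, so it is left commented out):
#   key_works?("https://wpcommunity.com",
#              api_key: raw_key, api_username: "fedora-devops")
```

Running this against the target domain, not the default site, is what would have exposed the wrong-database problem before the keys were distributed.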