Automate Let's Encrypt for 10–150 WordPress Sites: A Practical 7-Step Plan Agencies and Freelancers Can Use

1) Why automating Let's Encrypt stops the firefighting and keeps clients happy

If you manage between 10 and 150 WordPress sites you already know the pattern: a certificate expires at a bad hour, a client panics, and you drop everything to log into a dozen dashboards. Let's Encrypt removes the cost barrier, but without a repeatable process it just changes the type of work you do from buying certs to triaging renewals. Automating issuance and renewal across many sites buys you predictable operations, fewer late-night calls, and a credible security story you can present to clients.

Automation reduces human error, but it also creates a single point of failure if done carelessly. That tradeoff matters: one misconfigured automation can cause mass outages or mass key exposure. This list shows how to build automation with containment, monitoring, and recovery so you gain reliability without multiplying risk.

Think of the result in simple terms: regular renewals that just happen, tests that catch failures before they affect users, and clear playbooks when something unusual does occur. If you run an agency, that translates into predictable SLAs, less staff time wasted on emergency fixes, and fewer unhappy clients. If you’re a freelancer, it means you can scale beyond a handful of sites without constant firefighting.

2) Standardize hosting and DNS so ACME challenges work reliably

Automation starts with standardization. The two ACME challenge types you will use most, HTTP-01 and DNS-01, both assume you can present a challenge response reliably. If every client uses a different control panel or DNS provider, or has Cloudflare proxying traffic, your automation will be brittle.

Begin by inventorying each site: hosting provider, control panel, DNS host, whether Cloudflare or similar is proxying, and whether the domain uses DNSSEC. From there, pick a small set of supported patterns for client onboarding. For example:

- Direct-managed DNS with provider API access (Route53, Cloudflare, DigitalOcean).
- DNS delegated to a subdomain you control for wildcard certs when clients agree.
- HTTP-01 for straightforward sites where you control port 80 and there is no proxying.

Practical specifics matter. If you plan to use DNS-01 for wildcard certificates, ensure your DNS provider supports API tokens with scoped permission and predictable rate limits. If using HTTP-01, ensure port 80 is open and not redirected in a way that breaks the ACME challenge. If Cloudflare sits in front of a site, either temporarily disable the proxy during validation or use DNS-01 through Cloudflare’s API token.
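
The selection rules above can be sketched as a small shell function that an onboarding script might call for each site in the inventory. This is a minimal illustration, not a complete policy: the name pick_challenge, its four yes/no arguments, and the manual-review fallback are assumptions made for this example.

```shell
# pick_challenge WILDCARD PROXIED DNS_API PORT80   (each argument is "yes" or "no")
# Prints the challenge type to use for a site, or "manual-review" when
# neither challenge can be automated with what the inventory says we control.
# The function body is a subshell so its variables do not leak to the caller.
pick_challenge() (
    wildcard=$1; proxied=$2; dns_api=$3; port80=$4

    if [ "$dns_api" = yes ] && { [ "$wildcard" = yes ] || [ "$proxied" = yes ]; }; then
        # Wildcards require DNS-01; proxied sites are simpler with DNS-01.
        echo dns-01
    elif [ "$wildcard" = no ] && [ "$proxied" = no ] && [ "$port80" = yes ]; then
        # Plain site, direct traffic, port 80 under our control.
        echo http-01
    elif [ "$dns_api" = yes ]; then
        # Port 80 is not available, but DNS can still be automated.
        echo dns-01
    else
        echo manual-review
    fi
)
```

For example, `pick_challenge no yes yes yes` prints dns-01 for a Cloudflare-proxied site with API access, while a wildcard request without DNS API access falls through to manual-review.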

Thought experiment

Imagine you consolidate DNS for 30 clients to a single API-friendly provider. You reduce complexity, but you also centralize risk: an API outage or a compromised token could affect all clients. Balance this by using per-client API tokens where possible and keeping a small number of DNS accounts rather than one monolith.

Small steps: set TTLs low before DNS changes, add CAA records to prevent other CAs issuing certificates unexpectedly, and use predictable naming for zones so scripts can operate reliably.
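
Before relying on CAA records, it helps to confirm that the ones already published do not block Let's Encrypt. The sketch below is a rough check that assumes dig is installed; the function name caa_allows_letsencrypt is hypothetical. It only inspects the exact name you pass, whereas a CA also climbs to parent domains when no CAA record set exists at that name, and it does not distinguish issue from issuewild.

```shell
# caa_allows_letsencrypt DOMAIN
# Exit 0 if DOMAIN publishes no CAA records (no restriction at this name)
# or if one of them names letsencrypt.org; exit 1 if CAA records exist but
# none of them do; exit 2 if the lookup itself fails.
caa_allows_letsencrypt() (
    records=$(dig +short CAA "$1") || exit 2
    if [ -z "$records" ]; then
        exit 0
    fi
    # dig +short prints lines such as: 0 issue "letsencrypt.org"
    printf '%s\n' "$records" | grep -Eq 'issue(wild)? "letsencrypt\.org'
)
```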

3) Choose an ACME client and orchestration model that fits scale and ops discipline

There are several mature ACME clients: Certbot, acme.sh, lego, and vendor-integrated options inside control panels. The right choice depends on your environment. Certbot is widely used and well-documented but can be heavy on servers. acme.sh is lightweight and supports many DNS APIs. If you need a programmatic library, lego is a solid Go option.

At scale, a single ACME client per site is painful. Consider one of these orchestration patterns:

- Central certificate manager: a service that requests certificates and distributes them to servers. This can be homegrown or built on tools like HashiCorp Vault with an ACME plugin.
- Edge issuance: each host handles its own renewals via a lightweight ACME client, with centralized reporting to detect failures.
- Containerized renewers: run renewals in containers that consume DNS API tokens and push certificates to storage like an encrypted S3 bucket.

Test against Let’s Encrypt staging endpoints first to avoid rate limits while developing. Speaking of limits, track common restrictions: duplicate certificate limits and certificates-per-registered-domain limits. Implement backoff and retry logic, and keep a local cache of successful certificates so a transient failure doesn’t force you into immediate reissuance attempts.
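
One way to add the backoff and retry logic is a generic wrapper around whatever ACME client command you run. The sketch below is an assumption-laden example rather than a feature of any client: the name retry_with_backoff and its argument order are made up here, and the delay simply doubles after each failed attempt.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# wait between attempts. The body is a subshell, so COMMAND cannot change
# variables in the calling shell.
retry_with_backoff() (
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            exit 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            exit 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
)
```

A call such as `retry_with_backoff 4 60 certbot renew --quiet` would wait 60, 120, and then 240 seconds between its four attempts. Keep the attempt count low so a persistent failure surfaces as an alert instead of eating into rate limits.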

Commands and examples

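For DNS-01 issuance using acme.sh with a Cloudflare API token:

export CF_Token=

acme.sh --issue --server letsencrypt --dns dns_cf -d example.com -d '*.example.com' --home /opt/acme

Quote the wildcard name so the shell does not expand it as a glob. Recent acme.sh releases default to a different CA, so pass --server letsencrypt explicitly. Always add --staging while testing, then drop it and reissue against production once the flow is stable.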

4) Wildcard, SAN, or per-site certificates - pick the right boundary

Certificate structure is an operational decision with security and management consequences. Wildcard certificates (for *.example.com) simplify renewals for many subdomains. SAN certificates can cover a handful of hostnames in one object. Per-site certificates keep blast radius small but increase issuance operations.

Key considerations:

- Blast radius: a leaked private key for a wildcard affects every hostname it covers. For agencies managing multiple clients under a single domain, that risk is material.
- Renewal frequency: Let’s Encrypt certs last 90 days. Wildcards and SANs still need regular rotation, but the fewer certificates you manage, the fewer operations you’ll run.
- Operational separation: for client isolation, prefer per-client certificates. For a single client with many subdomains, wildcard simplifies management.

Thought experiment

Imagine a contractor with 40 subdomains under a single client. Using a wildcard certificate reduces renewal events to one per 90 days. Now imagine that private key is discovered in a misplaced backup. The cost of rotating a single wildcard and updating all affected servers may be higher than using separate certs with tighter access controls. Decide based on how you manage keys, backups, and access control.

Practical rule of thumb: prefer per-client isolation. Use wildcards only when you can guarantee tight key custody and quick rotation procedures.

5) Build monitoring, alerting, and a clear recovery playbook

Automation must be observable. A silent renewal failure is worse than a manual process you notice. Build monitoring with these elements:

- Expiry checks: monitor days-until-expiry for every cert and alert at sensible thresholds, for example 30, 14, and 3 days.
- Synthetic HTTPS checks: regular HTTP requests that validate the certificate chain, hostname match, and TLS handshake using tools like curl or sslscan.
- ACME operation alerts: log issuance and renewal events centrally and alert on repeated failures or API errors.
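
The expiry thresholds above map naturally onto a small classifier that a cron job can call per certificate. This is a sketch under stated assumptions: the function names and the severity labels are invented for the example, and cert_days_left assumes openssl and GNU date (the -d option) are available on the host.

```shell
# expiry_severity DAYS_LEFT
# Maps days until expiry onto the 30/14/3-day thresholds.
expiry_severity() {
    if   [ "$1" -le 3 ];  then echo critical
    elif [ "$1" -le 14 ]; then echo warning
    elif [ "$1" -le 30 ]; then echo notice
    else echo ok
    fi
}

# cert_days_left CERT_FILE
# Prints whole days until the certificate in CERT_FILE expires.
# Integer division truncates toward zero, so a partial day counts as 0.
cert_days_left() (
    end=$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)
    [ -n "$end" ] || exit 1            # openssl could not read the file
    end_epoch=$(date -d "$end" +%s) || exit 1
    now_epoch=$(date +%s)
    echo $(( (end_epoch - now_epoch) / 86400 ))
)
```

Chaining the two, for example `expiry_severity "$(cert_days_left /path/to/fullchain.pem)"`, gives one label per certificate that your alerting can route by severity. The path is a placeholder.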

Design alerts to avoid noise. An alert should mean “someone needs to act.” If your monitoring spits out low-importance items, tune thresholds and group related alarms. Attach clear runbooks to each alert explaining steps to check DNS, check firewall rules, and how to force a renewal against the staging endpoint for debugging.

Recovery playbook example:

1. Confirm the failure using a synthetic check and ACME logs.
2. Check DNS and CAA records, and verify that ports 80 and 443 are reachable.
3. Run a test renewal against the Let’s Encrypt staging endpoint to avoid rate limits while debugging.
4. If key compromise is suspected, rotate the key, revoke the old certificate, and notify affected clients.

Keep a short list of staff who know the runbook and run quarterly drills so the team can execute under time pressure.

6) Handle the tricky edge cases so automation doesn’t break in the wild

At scale you’ll hit edge cases. Expect them and build for them. Common ones include Cloudflare or CDN proxying that intercepts HTTP-01 challenges, DNS providers with slow propagation or API rate limits, and shared hosting that prevents you from binding port 80. Here are practical mitigations:

- Cloudflare: prefer DNS-01 using Cloudflare API tokens rather than trying to toggle the proxy off in production.
- Shared hosting: use DNS-01 or ask the host for a cert upload workflow. If none exists, consider moving high-risk clients to a managed plan that allows proper automation.
- DNS propagation: lower TTLs before making changes, and avoid large-scale changes during renewals.
- Rate limits: use the staging endpoint when testing; cache and reuse certificates where safe; stagger renewals to avoid bursts.
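
Staggering can be as simple as giving each domain a fixed offset inside a renewal window, so that 150 sites do not hit the ACME API in the same minute. The sketch below derives that offset from a checksum of the domain name. The function name and the default 1440-minute window are assumptions for illustration.

```shell
# renewal_offset_minutes DOMAIN [WINDOW_MINUTES]
# Prints a stable offset in the range 0..WINDOW_MINUTES-1 for DOMAIN.
# The same domain always gets the same offset, so schedules stay predictable.
renewal_offset_minutes() (
    domain=$1
    window=${2:-1440}
    # cksum is POSIX; the first field of its output is a CRC of the input.
    sum=$(printf '%s' "$domain" | cksum | cut -d' ' -f1)
    echo $(( sum % window ))
)
```

A daily cron job could then sleep for `$(( $(renewal_offset_minutes example.com) * 60 ))` seconds before calling the ACME client, which spreads the portfolio across the day without any shared state.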

More advanced topics to consider: OCSP stapling for faster revocation checks (confirm current CA support first, since Let's Encrypt has been moving from OCSP to CRLs), certificate transparency monitoring to detect unexpected issuance, and HSTS preload consequences if you decide to push sites into preload lists. Each of these can strengthen security but also adds operational requirements.

Debugging checklist for a failing HTTP-01 challenge:

- Is port 80 reachable from the public internet? Test from an external host.
- Is there a redirect loop sending ACME requests elsewhere?
- Are host headers correct? Some CDNs rewrite host headers and break validation.
- Does the server serve the challenge file at /.well-known/acme-challenge/ as expected?
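
Part of this checklist can be scripted with curl from an external host. The sketch below assumes you have placed a test file (named probe-test by default, a made-up name) in the site's challenge directory. The function name probe_challenge and its output messages are also assumptions. A redirect is not automatically a failure, but the target should still serve the same file.

```shell
# probe_challenge HOST [TOKEN]
# Requests http://HOST/.well-known/acme-challenge/TOKEN without following
# redirects and reports what came back. Exits non-zero when the host is
# unreachable or the response is neither a 200 nor a redirect.
probe_challenge() (
    host=$1
    token=${2:-probe-test}
    url="http://$host/.well-known/acme-challenge/$token"
    result=$(curl -sS -o /dev/null --max-time 10 \
        -w '%{http_code} %{redirect_url}' "$url") || {
        echo "unreachable: $url"
        exit 1
    }
    code=${result%% *}       # text before the first space
    redirect=${result#* }    # text after the first space (may be empty)
    case $code in
        200) echo "ok: $url served directly" ;;
        30?) echo "redirect: $url -> $redirect" ;;
        *)   echo "unexpected HTTP $code for $url"; exit 1 ;;
    esac
)
```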

Your 30-Day Action Plan: Move from firefighting to predictable, automated renewals

Use this practical timetable to convert the guidance above into action. Estimate time per item based on scale; smaller portfolios take less time, larger ones more.

- Days 1-3: Inventory. Record host, DNS provider, proxy, and control panel for each site. Flag high-risk clients (shared hosting, Cloudflare, complex DNS).
- Days 4-7: Choose standard patterns. Decide which DNS providers you’ll support and whether you’ll use DNS-01 or HTTP-01 for each site category. Sign up for staging accounts if needed.
- Days 8-12: Implement a proof of concept. Use the staging API and issue certs for a handful of noncritical domains. Validate renewals automatically and record the steps.
- Days 13-18: Build monitoring and alerts. Add expiry checks, synthetic TLS checks, and ACME log aggregation. Create runbooks for the top three failure modes.
- Days 19-24: Roll out in waves. Start with low-risk clients, then medium, then high. Keep rate limit math in mind and stagger rollouts to avoid throttling.
- Days 25-30: Drill and document. Run a simulated failure, execute the runbook, and update documentation. Train one backup team member who can act when you are unavailable.

Checklist before full handoff:

- Staging tests passed for all supported patterns.
- Monitoring alerts created and tested.
- Runbooks written and accessible to on-call staff.
- Key management and backups documented; per-client scoping enforced where needed.

Final note: automation is powerful but not magical. Keep the principles of least privilege, good logging, and regular testing. When your system is predictable and observable, you stop firefighting. Your clients see fewer outages and you get time back to focus on design and feature work.