[Audio] Healing USA-DC-01 at GVL: Fixing Legacy Active Directory at USA Production Parts (usaproduction.local).
[Audio] Definitions. NTFRS (FRS): the legacy File Replication Service, which keeps SYSVOL identical across domain controllers by replicating each changed file in full. DFSR: Distributed File System Replication, the replacement service supported for SYSVOL starting with Windows Server 2008; it replicates only the changed portions of SYSVOL files. SYSVOL: a shared directory on each DC that stores public files needed by clients, most notably Group Policy and logon scripts. NETLOGON: a share that stores and provides access to logon scripts, user profiles, and system policies. Stale objects: unused or inactive user accounts and computer objects that have a security and performance impact on the environment. Schema: the blueprint that defines the object classes and attributes that can be created in Active Directory. Metadata: data about computers and users, including replication details, health, and history.
[Audio] Environment Overview. Single forest, single domain: usaproduction.local. Headquarters: Gainesville (GVL), USA-DC-01 (Windows Server 2022). Branch: Hartwell, a new remote domain controller connected over a site-to-site VPN. Roles: AD DS, DNS, Group Policy, file/print, and authentication.
[Audio] Long Upgrade Path. The GVL domain originated as SBS 2003 (USAMAIN). In-place upgrades or migrations followed: 2003 R2 → 2008 → 2012 → 2012 R2 → 2016 → 2022. New DCs were added over time, and old DCs were not always properly demoted. Result: stale objects, stale replication partners, missing schema extensions, and a legacy SYSVOL configuration.
[Audio] Symptom 1 – Missing SYSVOL / NETLOGON. On the Hartwell DC, running net share showed no SYSVOL or NETLOGON share. Without these shares, Group Policy and logon scripts do not apply. Hartwell users experience inconsistent logons and GPO enforcement. This indicates SYSVOL replication and/or DFSR/FRS configuration problems.
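The share check described above can be scripted. The following is a minimal sketch, assuming the text of `net share` has already been captured on the DC and that the share name is the first whitespace-delimited token on each row. The function name `missing_dc_shares` and both sample listings are hypothetical and are not captured from the real servers.

```python
from typing import List

REQUIRED_SHARES = ("SYSVOL", "NETLOGON")

def missing_dc_shares(net_share_output: str) -> List[str]:
    """Return the required DC shares that do not appear in `net share` output."""
    found = set()
    for line in net_share_output.splitlines():
        tokens = line.split()
        if tokens:
            # Assumption: the share name is the first token on each row.
            found.add(tokens[0].upper())
    return [name for name in REQUIRED_SHARES if name not in found]

# Illustrative listings only, shaped like `net share` output.
HEALTHY = r"""Share name   Resource                        Remark
---------------------------------------------------------------
C$           C:\                             Default share
NETLOGON     C:\Windows\SYSVOL\sysvol\usaproduction.local\SCRIPTS
SYSVOL       C:\Windows\SYSVOL\sysvol        Logon server share
The command completed successfully.
"""

BROKEN = r"""Share name   Resource                        Remark
---------------------------------------------------------------
C$           C:\                             Default share
IPC$                                         Remote IPC
The command completed successfully.
"""

print(missing_dc_shares(HEALTHY))  # no shares missing
print(missing_dc_shares(BROKEN))   # both required shares missing
```

An empty result means both shares are advertised; any name returned points back to the SYSVOL replication problem described in this symptom.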
[Audio] Symptom 2 – Replication Errors and Ghost USAMAIN. dcdiag and repadmin /showrepl reported replication issues. Active Directory still referenced the legacy DC “USAMAIN” (SBS), and ghost entries remained in the global catalog. Errors included a server that could not be found for more than 780 days. The replication topology included nonexistent partners, blocking healthy replication.
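A partner that has not replicated for hundreds of days can be flagged mechanically once the last-success timestamps are known. This is a minimal sketch, assuming those timestamps have already been collected (for example, by hand from repadmin /showrepl). The function name, the 180-day default threshold, and the sample dates are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from typing import Dict

def stale_partners(last_success: Dict[str, datetime],
                   now: datetime,
                   max_age_days: int = 180) -> Dict[str, int]:
    """Map each partner whose last successful replication is older than
    max_age_days to the age of that last success, in whole days."""
    limit = timedelta(days=max_age_days)
    return {
        name: (now - when).days
        for name, when in last_success.items()
        if now - when > limit
    }

# Hypothetical timestamps chosen to mirror the 780-day gap described above.
now = datetime(2025, 1, 1)
observed = {
    "USAMAIN": now - timedelta(days=780),
    "USA-DC-03": now - timedelta(hours=1),
}
print(stale_partners(observed, now))  # {'USAMAIN': 780}
```

The threshold is a parameter because the appropriate cutoff depends on the forest's tombstone lifetime, which should be confirmed rather than assumed.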
[Audio] Symptom 3 – FRS vs. DFSR Confusion. The environment should be using DFSR for SYSVOL on modern Windows Server. The FRS service was still generating events and deprecation warnings. DFSRMIG reported the global state 'Eliminated', but the configuration was inconsistent. DFSR objects (replication group, members) were incomplete or missing.
[Audio] Root Cause: a long-term, multi-step upgrade path without full cleanup. The domain was upgraded piecemeal from SBS 2003 → 2003 R2 → 2008 → 2012 → 2012 R2 → 2016 → 2022, but prior DCs (especially SBS/USAMAIN and intermediate 2012 DCs) were not fully demoted and cleaned from AD. Ghost IP addresses existed in the global catalog. This left stale replication partners and an FRS-based SYSVOL configuration baked into the directory.
[Audio] Root Cause, continued: FRS-based SYSVOL persisted into modern Server versions. SYSVOL remained on FRS far past the point where it should have been migrated. FRS was throwing deprecation warnings but still driving SYSVOL on core DCs (for example, USA-DC-01), even as newer OS versions were introduced.
[Audio] Schema Extensions. The DFSR schema extensions were completely missing. This most likely happened at the first or second upgrade/migration.
[Audio] Corrupted or incomplete DFSR metadata in AD. DFSR AD objects (for example, the Domain System Volume replication group) were present but missing attributes such as rg_guid_hex. The GUIDs that are supposed to be present within DFSR were unavailable or null. Attempts to add a DFSR member for USA-DC-03 failed because the underlying AD structure was invalid. This suggests an earlier, incomplete DFSR migration or manual AD edits during the upgrade path.
[Audio] DFS Replication not installed properly on all DCs (critical for 2022). In the final 2022 environment, USA-DC-03 did not have all the DFS Replication role services installed, yet the domain had fully moved SYSVOL to DFSR (dfsrmig global state "Eliminated"). This guaranteed two things: no listener on TCP 5722 on DC-03, and no DFSR SYSVOL replication to or from Hartwell. A temporary decision was made to use FRS to see whether SYSVOL/NETLOGON would transfer; SYSVOL replication started and then quit. DFS health tools and DFS Management continued to fail for DC-03.
[Audio] Network rules lagged behind AD evolution. SonicWall and Windows Firewall configurations were not initially aligned with the required port set for DFSR and RPC. The DFSR port (TCP 5722) was not open and could not be opened, so a static port was configured instead. The static DFSR port, 38904, worked in only one direction (Hartwell → GVL). This made it harder to distinguish pure configuration issues from true role or metadata problems.
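Because the static port worked in only one direction, a reachability test has to be run from each site toward the other. The sketch below is one way to do that with Python's standard socket module. The helper names and the host name in the usage comment are assumptions; port 38904 is the static DFSR port named above, and the remaining ports are the well-known DNS, Kerberos, RPC endpoint mapper, LDAP, and SMB ports.

```python
import socket
from typing import Dict, Iterable

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def check_ports(host: str, ports: Iterable[int],
                timeout: float = 3.0) -> Dict[int, bool]:
    """Probe each TCP port on host and report which ones accepted a connection."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}

# Usage (hypothetical host name; run once from Hartwell and once from GVL):
#   check_ports("usa-dc-01.usaproduction.local", [53, 88, 135, 389, 445, 38904])
```

A successful connect only shows that the port is reachable from the machine running the check. It says nothing about the reverse path, which is why the same probe needs to be repeated from the other site.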
[Audio] Remediation Strategy – High Level. 1. Stabilize the GVL DC (USA-DC-01) as the reference domain controller. 2. Perform metadata cleanup of dead domain controllers (for example, USAMAIN) and all ghost DNS entries. 3. Repair SYSVOL replication and the DFSR configuration. 4. Rebuild the broken Hartwell DC cleanly when necessary. 5. Fix AD sites/subnets and firewall ports to support healthy replication.
[Audio] Step 1 – Stabilize HQ DC (USA-DC-01). Validate AD health with dcdiag /v. Check replication status with repadmin /showrepl /errorsonly. Confirm that SYSVOL and NETLOGON exist via net share. Confirm DNS health and that all FSMO roles are on USA-DC-01. Use this DC as the authoritative source for the domain.
[Audio] Step 2 – Metadata Cleanup of Legacy DCs. Identify retired DCs (for example, USAMAIN and older intermediates) in AD. Use Active Directory Sites and Services and NTDSUTIL for cleanup. Remove stale NTDS Settings and server objects for nonexistent DCs. Clean up ghost DNS records in the _msdcs zone and in SRV records. End state: only live DCs appear in the topology and DNS.
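Identifying cleanup candidates amounts to comparing what the directory still lists against the DCs that actually exist. This is a minimal sketch, assuming both lists have been gathered by hand (for example, from Active Directory Sites and Services); the function name and the sample inventory are hypothetical.

```python
from typing import Iterable, List

def stale_server_objects(ad_server_objects: Iterable[str],
                         live_dcs: Iterable[str]) -> List[str]:
    """Return server objects still listed in AD that match no live DC.
    Names are compared case-insensitively."""
    live = {name.upper() for name in live_dcs}
    return sorted(name for name in ad_server_objects if name.upper() not in live)

# Hypothetical inventory for illustration.
listed_in_ad = ["USA-DC-01", "usamain", "USA-DC-03"]
actually_running = ["usa-dc-01", "USA-DC-03"]
print(stale_server_objects(listed_in_ad, actually_running))  # ['usamain']
```

The result is only a candidate list. Each name should be confirmed as truly retired before its metadata is removed.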
[Audio] Step 3 – SYSVOL / DFSR Repair. Verify the DFSRMIG global state and the per-DC migration status. Validate the DFSR configuration objects: DFSR-GlobalSettings, the Domain System Volume group, and member objects. Align AD objects with the actual servers in use. Ensure all domain controllers participate correctly in DFSR-based SYSVOL replication. Monitor the DFSR event logs for clean replication.
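The comparison between the global migration state and each DC's local state can be expressed compactly. The sketch below models the four stable DFSRMIG states (0 Start, 1 Prepared, 2 Redirected, 3 Eliminated) and lists the DCs that have not reached the global state. The class and function names are hypothetical, and the per-DC values are illustrative rather than taken from the environment.

```python
from enum import IntEnum
from typing import Dict, List

class SysvolMigrationState(IntEnum):
    """Stable SYSVOL migration states as numbered by dfsrmig."""
    START = 0
    PREPARED = 1
    REDIRECTED = 2
    ELIMINATED = 3

def lagging_dcs(global_state: SysvolMigrationState,
                per_dc: Dict[str, SysvolMigrationState]) -> List[str]:
    """Return the DCs whose local state is behind the global state."""
    return sorted(dc for dc, state in per_dc.items() if state < global_state)

# Illustrative values: the global state is Eliminated, but one DC never caught up.
states = {
    "USA-DC-01": SysvolMigrationState.ELIMINATED,
    "USA-DC-03": SysvolMigrationState.START,
}
print(lagging_dcs(SysvolMigrationState.ELIMINATED, states))  # ['USA-DC-03']
```

An empty list means every DC has reached the global state. It does not prove that the DFSR objects in AD are intact, so the object validation in this step is still required.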
[Audio] Step 4 – Rebuild Hartwell DC. If SYSVOL/NETLOGON cannot be safely repaired, demote the branch DC. Clean up any remaining references to the old instance in AD and DNS. Rejoin the server to the domain as a member, then promote it to a DC again. Specify USA-DC-01 as the replication partner during promotion. Confirm that the SYSVOL and NETLOGON shares appear and that replication is healthy.
[Audio] Step 5 – Sites, Subnets and Firewall Ports. Map the GVL subnet (for example, 192.168.1.0/24) to Site: Default-First-Site-Name. Map the Hartwell subnet (for example, 192.168.9.0/24) to Site: HARTWELL. Ensure each DC resides in the correct site with the correct subnet association. On the SonicWall, allow the required DC ports: LDAP, Kerberos, RPC, DNS, DFSR, et cetera. Verify site-to-site VPN stability and latency for replication.
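The subnet-to-site mapping can be sanity-checked offline before it is entered in Active Directory Sites and Services. This sketch uses Python's standard ipaddress module with the two example subnets from this step. It assumes the GVL site keeps the default name Default-First-Site-Name, and the function name and test addresses are hypothetical.

```python
import ipaddress
from typing import Dict, Optional

# Example subnets from the step above; the GVL site name is an assumption.
SITE_SUBNETS: Dict[str, str] = {
    "192.168.1.0/24": "Default-First-Site-Name",
    "192.168.9.0/24": "HARTWELL",
}

def site_for(ip: str, subnets: Dict[str, str] = SITE_SUBNETS) -> Optional[str]:
    """Return the site whose subnet contains ip, or None if no subnet matches."""
    addr = ipaddress.ip_address(ip)
    for cidr, site in subnets.items():
        if addr in ipaddress.ip_network(cidr):
            return site
    return None

print(site_for("192.168.9.25"))  # HARTWELL
print(site_for("192.168.1.10"))  # Default-First-Site-Name
print(site_for("10.0.0.5"))      # None
```

An address that maps to no site is the case to watch for. The first-match lookup is adequate here only because the two example subnets do not overlap.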
[Audio] Validation and Results. dcdiag and repadmin show clean replication with no errors. SYSVOL and NETLOGON are present and accessible on all DCs. All ports and access rules that the domain controllers need to communicate have been tested in both directions between sites. No references to USAMAIN and no 700+ day replication gaps remain. DFSR events indicate healthy SYSVOL replication across sites. Branch users reliably receive GPOs and authentication from the local DC.
[Audio] Outcome. Always demote domain controllers properly; never just power them off or repurpose them. Complete FRS-to-DFSR SYSVOL migrations and verify the end state rather than assuming it. Regularly run dcdiag and repadmin, and review event logs after major changes. Treat Active Directory as a critical application, not just another server. Document upgrade paths, cleanup steps, and validation tests for future work.
[Audio] Business Impact: Logon and Authentication Symptoms. What users see: Very slow logons: “Applying computer settings…” or “Please wait for the User Profile Service” hanging for 30 to 120+ seconds, especially at the remote site when the local DC is broken and clients fall back across the VPN. Intermittent “no logon server” errors: “There are currently no logon servers available to service the logon request” when the local DC can’t serve SYSVOL/NETLOGON and the WAN link or alternate DC isn’t reachable. Random credential prompts when opening file shares, intranet sites, or line-of-business apps that normally use SSO. Machines acting as if they are off the domain: users cannot change passwords, new PCs cannot be joined to the domain from the affected site, or domain joins time out.
[Audio] Group Policy Not Applying (or Applying Inconsistently). When SYSVOL replication is broken, GPOs are not consistent across DCs. Typical symptoms include: Mapped drives are missing or wrong: some users get their usual H: / S: / department shares, others don’t; drives may appear and then disappear after a reboot. Printers come and go: GPO-deployed printers vanish or change unexpectedly. New policies don’t stick: changes work at HQ (USA-DC-01) but don’t appear at Hartwell (DC-03/05), or appear only on some machines. Security policies are inconsistent: password complexity/lockout rules and firewall baselines differ between machines. Logon scripts do not run: \\domain\SYSVOL or \\domain\NETLOGON scripts fail or are missing at the remote site, causing users to lose drive mappings or environment variables.
[Audio] File/Share and SYSVOL Access Issues. “Path not found” for domain-based paths such as \\domain.local\SYSVOL or \\domain.local\NETLOGON; these may be intermittently unavailable from certain sites. Logon scripts that are hard-coded to SYSVOL/NETLOGON paths fail silently or generate errors. DFS namespace oddities (if used): some file shares under a DFS namespace (for example, \\domain\files\…) work for HQ users but not for branch users; Hartwell users may get old targets or “folder not available” messages because referrals rely on broken DCs.
[Audio] DNS and Name Resolution Oddities. Replication and DNS issues tied to broken DCs show up as: “Server not found” / can only connect by IP: users can ping a server by IP but not by name, and the behavior is intermittent depending on which DC/DNS server they query. Different name resolution in different offices: HQ resolves a server name one way, while the remote site resolves it differently or not at all. An internal website (for example, intranet.company.com) loads in one office but fails in another. Autodiscover / email profile problems (if on-premises or hybrid): Outlook repeatedly prompts for credentials or cannot connect due to DNS/SRV replication inconsistencies.
[Audio] Remote Site–Specific Symptoms (Hartwell / DC-03 / DC-05). Given the current layout, Hartwell is where the issues are most visible. Logons at Hartwell are slow or fail when the VPN is flaky: the local DC (DC-03 or DC-05) is not a healthy SYSVOL/DFSR participant, so clients must reach GVL DC-01 across the VPN. If the VPN is down or congested, logons fail or take several minutes. GPOs “stuck in time”: Hartwell PCs may keep old GPO versions from the last time replication worked and never receive new policies from GVL. Inconsistent trust in the domain: sporadic “The trust relationship between this workstation and the primary domain failed” errors if secure channels reset and DCs disagree.
[Audio] Application and Service-Level Symptoms. Any AD-integrated application can expose these issues. Line-of-business apps can’t authenticate: app servers at GVL are fine, but users at the remote site are denied or time out depending on which DC handles authentication. Service accounts authenticated against a broken DC fail, while those hitting a healthy DC succeed. VPN authentication / SSO issues: if the SonicWall uses LDAP/LDAPS against a DC with broken replication or DNS, VPN logins may fail intermittently. Scheduled tasks and services fail silently: services that run under domain accounts (backups, scanning software, printers, et cetera) may fail because they cannot reach a healthy DC with up-to-date credentials and GPO rights.