In summary
- For any SQL DB Engine on cluster to be discovered, firstly ensure the SQL Server Network Name is already discovered in “Agentless Managed”.
- If not, check if there is any orphaned resource in Windows Cluster.
Details
For SQL DB Engine on Cluster, the discovery is expected to run on 3 objects.
- SQL Server Network Name (You can find the name in Failover Cluster Manager -> Cluster Name -> Roles -> Server Name)
- Cluster Name (The cluster name in Failover Cluster Manager)
- Server Name of the Node
Only the execution on SQL Server Network Name returns instances on it. Execution on the other 2 objects will show successful, but won’t return any discovered instance. That’s expected.
In SCOM ETL Trace:
Execution on 2012CL (SQL Server Network Name):
[1]24232.8364::12/26/2019-18:52:44.977 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6446, completion reason 0x0, task result is SUCCESS, task output is
. 0 0 {bb187d3c-1218-6f43-b6f6-0a0695b96bd7} {e12d4b62-8f91-0b5e-e583-47789cae70dc}
MachineName 2012CL.sqlrepro.edu InstanceName SQL2012AG DisplayName Resource Pool Group
PrincipalName 2012CL.sqlrepro.edu MachineName 2012CL.sqlrepro.edu InstanceName SQL2012AG
MachineName 2012CL.sqlrepro.edu NetbiosComputerName 2012CL NetbiosDomainName SQLREPRO InstanceName SQL2012AG DisplayName 2012CL\SQL2012AG ConnectionString 2012CL.sqlrepro.edu\SQL2012AG Edition Enterprise Edition InstanceID MSSQL11.SQL2012AG Language 1033 Version 11.0.2100.60 ServiceName MSSQL$SQL2012AG ServiceClusterName SQL SERVER (SQL2012AG) Cluster True PerformanceCounterObject MSSQL$SQL2012AG AuthenticationMode Windows Authentication Mode FullTextSearchServiceName MSSQLFDLauncher$SQL2012AG FullTextSearchServiceClusterName SQL Full-text Filter Daemon Launcher (SQL2012AG) AgentName SQLAgent$SQL2012AG Type DB Engine AgentClusterName SQL SERVER AGENT (SQL2012AG) MasterDatabaseLocation T:\MSSQL11.SQL2012AG\MSSQL\DATA\master.mdf MasterDatabaseLogLocation T:\MSSQL11.SQL2012AG\MSSQL\DATA\mastlog.ldf ErrorLogLocation T:\MSSQL11.SQL2012AG\MSSQL\Log\ERRORLOG ServicePackVersion 0 AuditLevel Failure InstallPath C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012AG\MSSQL ToolsPath C:\Program Files\Microsoft SQL Server\110\Tools EnableErrorReporting False Account sqlrepro\sqlsvc PrincipalName 2012CL.sqlrepro.edu MonitoringType Local
MachineName 2012CL.sqlrepro.edu InstanceName SQL2012AG
PrincipalName 2012CL.sqlrepro.edu
PrincipalName 2012CL.sqlrepro.edu MachineName 2012CL.sqlrepro.edu InstanceName SQL2012AG
Execution on Cluster2016 (Cluster Name):
[3]24232.8364::12/26/2019-18:53:05.985 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6448, completion reason 0x0, task result is SUCCESS, task output is
. 0 0 {bb187d3c-1218-6f43-b6f6-0a0695b96bd7} {124369dd-8822-2f6b-9ce7-9a1b06551b24}
# No SQL instance is returned here
Execution on Node4 (Server Name of the Node):
[1]24232.10072::12/26/2019-18:51:54.710 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6441, completion reason 0x0, task result is SUCCESS, task output is
. 0 0 {bb187d3c-1218-6f43-b6f6-0a0695b96bd7} {264b2a65-07f0-8cce-8757-b309a2bc20b8}
# No SQL instance is returned here
The 3 GUIDs in the trace are the BME IDs of the 3 objects:
If you check SCOM trace and find the discovery didn’t run on SQL Server Network Name, the first thing is to check whether the discovery target is already discovered in SCOM Console -> Monitoring -> Discovered Inventory ->
- Tracing the discovery chain to the source, if the SQL Server Network Name is already discovered and shows up in Agentless Managed, we need to troubleshoot the SQL discovery itself.
- Otherwise, if the SQL Server Network Name doesn’t show up in Agentless Managed, the issue is actually a Windows Cluster discovery issue.
For the Windows Cluster discovery issue, I followed below steps to resolve it.
- [VERY IMPORTANT] Confirm agent proxy is enabled on the agents.
- Check if the cluster has any orphaned resource blocking the cluster discovery.
a. Run this PowerShell command to check if there is any offline resources that had been removed from cluster manager.
get-clusterresource | where {$_.state -eq 'offline'}
b. Run this PowerShell command. Compare the result with the resources in Failover Cluster Manager to find out if there is any orphaned resource.
get-clusterresource | sort-object -Property ResourceType
All the resources returned by the command should be found in below areas in Failover Cluster Manager. Otherwise, there is an orphaned resource.
If any orphaned resource is found, use this command to remove the resource.
Remove-ClusterResource -Name "<Name of the resource>" -Force
After that, flush health service cache on the node and monitor for some time. If it goes well, the SQL Server Network Name and the Cluster Name Object should occur in Agentless Managed, then the SQL DB engines should be discovered gradually.
Reference
SCOM Won’t Discover My SQL Server (Or Cluster)
Cluster not appearing in Agentless Managed
Discovery for MS Clusters of Any Kind