[SCOM] SQL DB Engine Discovery Mechanism on Cluster

In summary

  • For any SQL DB Engine on cluster to be discovered, firstly ensure the SQL Server Network Name is already discovered in “Agentless Managed”.
  • If not, check if there is any orphaned resource in Windows Cluster.

Details

For SQL DB Engine on Cluster, the discovery is expected to run on 3 objects.

  • SQL Server Network Name (You can find the name in Failover Cluster Manager -> Cluster Name -> Roles -> Server Name)
  • Cluster Name (The cluster name in Failover Cluster Manager)
  • Server Name of the Node

Only the execution on SQL Server Network Name returns instances on it. Execution on the other 2 objects will show successful, but won’t return any discovered instance. That’s expected.

In SCOM ETL Trace:
Execution on 2012CL (SQL Server Network Name):

[1]24232.8364::12/26/2019-18:52:44.977 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6446, completion reason 0x0, task result is SUCCESS, task output is 00{bb187d3c-1218-6f43-b6f6-0a0695b96bd7}{e12d4b62-8f91-0b5e-e583-47789cae70dc}

MachineName2012CL.sqlrepro.eduInstanceNameSQL2012AGDisplayNameResource Pool Group
PrincipalName2012CL.sqlrepro.eduMachineName2012CL.sqlrepro.eduInstanceNameSQL2012AG
MachineName2012CL.sqlrepro.eduNetbiosComputerName2012CLNetbiosDomainNameSQLREPROInstanceNameSQL2012AGDisplayName2012CL\SQL2012AGConnectionString2012CL.sqlrepro.edu\SQL2012AGEditionEnterprise EditionInstanceIDMSSQL11.SQL2012AGLanguage1033Version11.0.2100.60ServiceNameMSSQL$SQL2012AGServiceClusterNameSQL SERVER (SQL2012AG)ClusterTruePerformanceCounterObjectMSSQL$SQL2012AGAuthenticationModeWindows Authentication ModeFullTextSearchServiceNameMSSQLFDLauncher$SQL2012AGFullTextSearchServiceClusterNameSQL Full-text Filter Daemon Launcher (SQL2012AG)AgentNameSQLAgent$SQL2012AGTypeDB EngineAgentClusterNameSQL SERVER AGENT (SQL2012AG)MasterDatabaseLocationT:\MSSQL11.SQL2012AG\MSSQL\DATA\master.mdfMasterDatabaseLogLocationT:\MSSQL11.SQL2012AG\MSSQL\DATA\mastlog.ldfErrorLogLocationT:\MSSQL11.SQL2012AG\MSSQL\Log\ERRORLOGServicePackVersion0AuditLevelFailureInstallPathC:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012AG\MSSQLToolsPathC:\Program Files\Microsoft SQL Server\110\ToolsEnableErrorReportingFalseAccountsqlrepro\sqlsvcPrincipalName2012CL.sqlrepro.eduMonitoringTypeLocal


MachineName2012CL.sqlrepro.eduInstanceNameSQL2012AG
PrincipalName2012CL.sqlrepro.edu
PrincipalName2012CL.sqlrepro.eduMachineName2012CL.sqlrepro.eduInstanceNameSQL2012AG

.

Execution on Cluster2016 (Cluster Name):

[3]24232.8364::12/26/2019-18:53:05.985 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6448, completion reason 0x0, task result is SUCCESS, task output is 00{bb187d3c-1218-6f43-b6f6-0a0695b96bd7}{124369dd-8822-2f6b-9ce7-9a1b06551b24}
# No SQL instance is returned here

.

Execution on Node4 (Server Name of the Node):

[1]24232.10072::12/26/2019-18:51:54.710 [ExecutionManager] [] [Information] :CTaskTracker::TaskCompleted{tasktracker_cpp1514}Notifying creator of task completion for task 6441, completion reason 0x0, task result is SUCCESS, task output is 00{bb187d3c-1218-6f43-b6f6-0a0695b96bd7}{264b2a65-07f0-8cce-8757-b309a2bc20b8}
# No SQL instance is returned here

.

The 3 GUIDs in the trace are the BME IDs of the 3 objects:

If you check SCOM trace and find the discovery didn’t run on SQL Server Network Name, the first thing is to check whether the discovery target is already discovered in SCOM Console -> Monitoring -> Discovered Inventory -> . An object with the SQL Server Network Name is probably missing there.

  • Tracing the discovery chain to the source, if the SQL Server Network Name is already discovered and shows up in Agentless Managed, we need to troubleshoot the SQL discovery itself.
  • Otherwise, if the SQL Server Network Name doesn’t show up in Agentless Managed, the issue is actually a Windows Cluster discovery issue.

For the Windows Cluster discovery issue, I followed below steps to resolve it.

  1. [VERY IMPORTANT] Confirm agent proxy is enabled on the agents.
  2. Check if the cluster has any orphaned resource blocking the cluster discovery.

    a. Run this PowerShell command to check if there is any offline resources that had been removed from cluster manager.

    get-clusterresource | where {$_.state -eq 'offline'}
    

    b. Run this PowerShell command. Compare the result with the resources in Failover Cluster Manager to find out if there is any orphaned resource.

    get-clusterresource | sort-object -Property ResourceType
    

    All the resources returned by the command should be found in below areas in Failover Cluster Manager. Otherwise, there is an orphaned resource.

    If any orphaned resource is found, use this command to remove the resource.

    Remove-ClusterResource -Name "<Name of the resource>" -Force
    

    After that, flush health service cache on the node and monitor for some time. If it goes well, the SQL Server Network Name and the Cluster Name Object should occur in Agentless Managed, then the SQL DB engines should be discovered gradually.

Reference

SCOM Won’t Discover My SQL Server (Or Cluster)
Cluster not appearing in Agentless Managed
Discovery for MS Clusters of Any Kind

Leave a Comment

Your email address will not be published. Required fields are marked *