Wednesday, April 29, 2015

Distributed Cache (repairing it with PowerShell)

* Recently we had issues with the distributed cache on our farm, which was set up quite some time ago when I built the farm with AutoSPInstaller.  The cause may have been a cumulative update rollout or something similar.  There is very little documentation on the web for this.

*  In our case we had 4 servers (2 web front-ends and 2 application servers), all with the distributed cache service enabled, but only one server was actually running it.

*  The correct topology for distributed cache is for it to exist on the web front-ends, so we rebuilt the cache cluster on the two web front-ends (WEB01 and WEB02) only.

We cleaned up all 4 servers by running the following commands on each one:

#Stopping the service on local host
Stop-SPDistributedCacheServiceInstance -Graceful

#Removing the service from SharePoint on local host.
Remove-SPDistributedCacheServiceInstance

#Cleanup left over pieces from SharePoint
$instanceName = "SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq $env:computername}
#Only delete if a leftover instance was actually found on this server
if ($serviceInstance) { $serviceInstance.Delete() }


Then we added the cache host back to WEB01:

#Re-add the server back to the cluster
Add-SPDistributedCacheServiceInstance

We then checked the SPDistributedCacheClientSettings and found that "MaxConnectionsToServer" was set to 16 for all containers.

$DLTC = Get-SPDistributedCacheClientSetting -ContainerType DistributedLogonTokenCache
$DLTC
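
The snippet above shows one container only. To see the value for every container at once, a loop along the following lines could be used. This is a minimal sketch: the container names are the ones used in the script later in this post, and the variable names `$containers` and `$settings` are just illustrative.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Container types, taken from the script further down in this post
$containers = 'DistributedLogonTokenCache',
              'DistributedViewStateCache',
              'DistributedAccessCache',
              'DistributedActivityFeedCache',
              'DistributedActivityFeedLMTCache',
              'DistributedBouncerCache',
              'DistributedDefaultCache',
              'DistributedSearchCache',
              'DistributedSecurityTrimmingCache',
              'DistributedServerToAppServerAccessTokenCache'

# Read-only: print the three settings of interest for each container
foreach ($container in $containers) {
    $settings = Get-SPDistributedCacheClientSetting -ContainerType $container
    $settings | Select-Object @{ Name = 'Container'; Expression = { $container } },
                              MaxConnectionsToServer,
                              RequestTimeout,
                              ChannelOpenTimeOut
}
```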

We used the following script to change "MaxConnectionsToServer" back to 1 and set the request and channel-open timeouts to 3000 ms for each container.

Add-PSSnapin Microsoft.Sharepoint.Powershell

#DistributedLogonTokenCache
$DLTC = Get-SPDistributedCacheClientSetting -ContainerType DistributedLogonTokenCache
$DLTC.MaxConnectionsToServer = 1
$DLTC.requestTimeout = "3000"
$DLTC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedLogonTokenCache -DistributedCacheClientSettings $DLTC

#DistributedViewStateCache
$DVSC = Get-SPDistributedCacheClientSetting -ContainerType DistributedViewStateCache
$DVSC.MaxConnectionsToServer = 1
$DVSC.requestTimeout = "3000"
$DVSC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedViewStateCache $DVSC

#DistributedAccessCache
$DAC = Get-SPDistributedCacheClientSetting -ContainerType DistributedAccessCache
$DAC.MaxConnectionsToServer = 1
$DAC.requestTimeout = "3000"
$DAC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedAccessCache $DAC

#DistributedActivityFeedCache
$DAF = Get-SPDistributedCacheClientSetting -ContainerType DistributedActivityFeedCache
$DAF.MaxConnectionsToServer = 1
$DAF.requestTimeout = "3000"
$DAF.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedActivityFeedCache $DAF

#DistributedActivityFeedLMTCache
$DAFC = Get-SPDistributedCacheClientSetting -ContainerType DistributedActivityFeedLMTCache
$DAFC.MaxConnectionsToServer = 1
$DAFC.requestTimeout = "3000"
$DAFC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedActivityFeedLMTCache $DAFC

#DistributedBouncerCache
$DBC = Get-SPDistributedCacheClientSetting -ContainerType DistributedBouncerCache
$DBC.MaxConnectionsToServer = 1
$DBC.requestTimeout = "3000"
$DBC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedBouncerCache $DBC

#DistributedDefaultCache
$DDC = Get-SPDistributedCacheClientSetting -ContainerType DistributedDefaultCache
$DDC.MaxConnectionsToServer = 1
$DDC.requestTimeout = "3000"
$DDC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedDefaultCache $DDC

#DistributedSearchCache
$DSC = Get-SPDistributedCacheClientSetting -ContainerType DistributedSearchCache
$DSC.MaxConnectionsToServer = 1
$DSC.requestTimeout = "3000"
$DSC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedSearchCache $DSC

#DistributedSecurityTrimmingCache
$DTC = Get-SPDistributedCacheClientSetting -ContainerType DistributedSecurityTrimmingCache
$DTC.MaxConnectionsToServer = 1
$DTC.requestTimeout = "3000"
$DTC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedSecurityTrimmingCache $DTC

#DistributedServerToAppServerAccessTokenCache
$DSTAC = Get-SPDistributedCacheClientSetting -ContainerType DistributedServerToAppServerAccessTokenCache
$DSTAC.MaxConnectionsToServer = 1
$DSTAC.requestTimeout = "3000"
$DSTAC.channelOpenTimeOut = "3000"
Set-SPDistributedCacheClientSetting -ContainerType DistributedServerToAppServerAccessTokenCache $DSTAC 
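
Because every container gets the same three values, the script above can also be written as a loop, which avoids copy-and-paste slips between blocks. This is a sketch rather than the script we actually ran: it assumes the same container list and the same values (1 connection, 3000 ms timeouts), and `$containers` and `$settings` are illustrative names.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Same ten container types as in the script above
$containers = 'DistributedLogonTokenCache',
              'DistributedViewStateCache',
              'DistributedAccessCache',
              'DistributedActivityFeedCache',
              'DistributedActivityFeedLMTCache',
              'DistributedBouncerCache',
              'DistributedDefaultCache',
              'DistributedSearchCache',
              'DistributedSecurityTrimmingCache',
              'DistributedServerToAppServerAccessTokenCache'

foreach ($container in $containers) {
    # Read the current settings object, change the three values, write it back
    $settings = Get-SPDistributedCacheClientSetting -ContainerType $container
    $settings.MaxConnectionsToServer = 1
    $settings.RequestTimeout         = 3000
    $settings.ChannelOpenTimeOut     = 3000
    Set-SPDistributedCacheClientSetting -ContainerType $container -DistributedCacheClientSettings $settings
}
```

As with the long version, the Distributed Cache service still has to be restarted afterwards for the new settings to be picked up.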

- We then stopped and restarted Distributed Cache from Central Admin on WEB01

- We then attempted to start "Distributed Cache" on WEB02 and received the error "failed to connect to hosts in the cluster".

- A TRACERT from WEB01 to WEB02 showed a device in the middle (10.21.1.5), so we wanted to confirm that it was not blocking the cache port.

- We installed the Telnet client to test the port:

Import-Module servermanager
Add-WindowsFeature telnet-client


- We ran Telnet from WEB01 to WEB02 on port 22233, and the connection was established.
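
If you would rather not install the Telnet client, the same port check can be sketched in plain PowerShell with the .NET `TcpClient` class. The host name WEB02 and port 22233 are the ones from this post; `$target`, `$port` and `$client` are illustrative names, and this is an alternative to what we did, not the exact step we took.

```powershell
# Host and port from this post; adjust for your own farm
$target = 'WEB02'
$port   = 22233

$client = New-Object System.Net.Sockets.TcpClient
try {
    # Connect throws if the port is closed, filtered, or the host is unreachable
    $client.Connect($target, $port)
    Write-Host "Connected to ${target}:${port}"
}
catch {
    Write-Host "Could not connect to ${target}:${port} - $($_.Exception.Message)"
}
finally {
    $client.Close()
}
```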

- We then stopped, cleaned up and re-added WEB02 to the cache cluster:

#Stopping the service on local host
Stop-SPDistributedCacheServiceInstance -Graceful

#Removing the service from SharePoint on local host.
Remove-SPDistributedCacheServiceInstance

#Cleanup left over pieces from SharePoint
$instanceName = "SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq $env:computername}
#Only delete if a leftover instance was actually found on this server
if ($serviceInstance) { $serviceInstance.Delete() }

Then we added the cache host back to WEB02:

#Re-add the server back to the cluster
Add-SPDistributedCacheServiceInstance

This time it started!

- Now we have WEB01 and WEB02 serving Distributed Cache.

- We checked the ULS logs with ULSViewer and found only successful events for Distributed Cache.

Status
=======
Distributed cache is now healthy and in a working state on both WFE Servers.
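
Besides the ULS logs, the state of the cache can also be checked from PowerShell. The following is a rough sketch, assuming it is run in a SharePoint Management Shell on one of the cache hosts and that the AppFabric administration cmdlets (`Use-CacheCluster`, `Get-CacheHost`) are available in that session. We did not capture output from these commands, so none is shown here.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# SharePoint's view: which servers have a Distributed Cache instance, and its status
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq 'Distributed Cache' } |
    Select-Object Server, Status

# AppFabric's view: connect to the local cache cluster and list the cache hosts
Use-CacheCluster
Get-CacheHost
```

In a healthy two-node setup like ours, both views should list WEB01 and WEB02 and nothing else.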
