Getting more than expected from a virtual-server training exercise

Sometimes it’s enough to fix an IT problem without discovering the root cause.

During a recent training exercise in a non-production environment, I built a Cisco ISE virtual server using VMware vSphere and succeeded in troubleshooting an issue along the way, which demonstrates the value of this type of exercise. It also shows how important it is for network engineers to have clear priorities and keep their eyes on the goals set for the task at hand.

In this exercise, the build of the virtual server gave me the option of using one of two datastores that we’ll call Datastore One and Datastore Two. It also provided the option of choosing from multiple ESXi host machines to launch the virtual server on, and we’ll designate them with letters such as Host A, Host B, etc. Some of the hosts could associate only with Datastore One, and the rest could associate only with Datastore Two.

For the sake of discussion, let’s say I initially chose Host A and Datastore One to create the virtual server. After it was built, I needed to perform a backup and restart the virtual server. The backup worked, but the restart failed, and I received two error notifications:

  • Power On Virtual Machine “The operation is not allowed in the current state”
  • Power on Virtual Machine “A General System Error Occurred: BPM error occurred during Pre Migrate check callback: Connection refused”

Troubleshooting

The virtual server provides three options that might address these errors: 1) change the host; 2) change the datastore; 3) change both. I tried option one, changing to a different host (call it Host B) but keeping the same datastore, Datastore One. That didn’t solve the problem. I tried option two, using the original host (Host A) with Datastore Two, but that didn’t work either. So I tried option three, using a different host from the original and a different datastore: a pairing of Host B with Datastore Two. The virtual server powered on, and it restarted with no error codes.
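The order of those three attempts can be sketched in a few lines of Python. This is only an illustration of the troubleshooting sequence, not a vSphere script: the names `Placement`, `candidate_placements`, and `first_working` are hypothetical, and the `observed` table simply replays the outcomes described above rather than calling any VMware API.

```python
from typing import Callable, List, NamedTuple, Optional


class Placement(NamedTuple):
    """A host/datastore pairing for the virtual server (hypothetical type)."""
    host: str
    datastore: str


def candidate_placements(original: Placement, alt_host: str,
                         alt_datastore: str) -> List[Placement]:
    """Return the three alternatives in the order they were tried."""
    return [
        Placement(alt_host, original.datastore),   # option 1: change the host
        Placement(original.host, alt_datastore),   # option 2: change the datastore
        Placement(alt_host, alt_datastore),        # option 3: change both
    ]


def first_working(placements: List[Placement],
                  try_power_on: Callable[[Placement], bool]) -> Optional[Placement]:
    """Try each placement in turn and return the first one that powers on."""
    for placement in placements:
        if try_power_on(placement):
            return placement
    return None


# Outcomes as described in the exercise (assumed here as a lookup table;
# a real check would be whatever power-on test the environment offers).
observed = {
    Placement("Host B", "Datastore One"): False,
    Placement("Host A", "Datastore Two"): False,
    Placement("Host B", "Datastore Two"): True,
}

original = Placement("Host A", "Datastore One")
candidates = candidate_placements(original, "Host B", "Datastore Two")
result = first_working(candidates, lambda p: observed.get(p, False))
print(result)
```

Changing one variable at a time before changing both keeps each failed attempt informative, since it shows that neither the host alone nor the datastore alone accounted for the error.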

That solved the restart problem, but for the purposes of the training exercise, the virtual server had to be associated with Datastore One, not Datastore Two. So I chose a third host (Host C) that was compatible with Datastore One, and built the virtual server on that, associating it with Datastore One. That virtual server also worked, backed up successfully, and restarted with no error codes. Mission accomplished.

Lessons learned

While I would like to know why I received those error codes, I never fully discovered the reason. I did discuss it with a colleague, though, who made an interesting suggestion: Because the virtual server was being built in a training environment where multiple colleagues were performing the same exercise, maybe something about building, removing, and rebuilding the virtual server over and over on the same machines caused the problem.

It’s a theory, but not an answer, and that’s OK. As IT professionals, part of our job is to troubleshoot known and unknown issues and figure out how to get the desired outcome even if the details of the underlying issue that blocks our way remain unknown. In this case we scored a success because we got the outcome we needed: setting up the Cisco ISE virtual server using Datastore One. Identifying the root cause would be nice but can wait for another day. Meanwhile, the training exercise prepared the team to go live with the virtual servers and gave us a puzzle we managed to solve.

Copyright © 2021 IDG Communications, Inc.
