SwackNet: 2012

Sunday, August 26, 2012

VMworld Hands-On Labs, Follow-Up


In case you hadn't heard, VMworld became "VMwait" today as I, along with quite a few other strong-willed geeks, waited well over seven (yes, that's SEVEN) hours before being seated for our first Hands-On Lab (HoL). Despite the hardships sustained by all, including the folks in green shirts running the labs, we all came through it alive and stronger for it. To make it up to us, they decided to stay open until 10pm at which time no new folks could enter but those of us that were there could finish what we started. Proudly, I managed to get three labs in (at least most of them) before heading back to my hotel for the night (sorry v0dgeball and VMunderground, I couldn't make it…maybe next year). Unfortunately, I heard some labs were still having problems even once they got the environment up and running. But luckily for me, I had fairly minimal issues and was able to learn lots!

I want to give a HUGE shout-out to Mr. Irish Spring who did an outstanding job listening to our feedback today and made sure we were supplied with refreshments when we got hungry and kept us informed.

Irish Spring

Also many thanks to Ms. Jennifer Galvin who spent some time chatting with some of us, listening to our (mostly justifiable) grumbling about the experiences of the day.

Jennifer Galvin

In all fairness to VMware, I understand that some of the back-side tech being used this year is different from last year's (indeed, some of it isn't even being announced until Monday morning's keynote). They took a risk and ended up having some problems. It's certainly happened to me. I'm betting it has happened (or will happen) to you.

Hopefully tomorrow will be a better day for everyone involved with the labs.

VMworld Hands-On Labs, First Look

My Sunday here at VMworld began with a good breakfast at a local bakery. I then headed to Moscone West shortly before the Hands-On Labs (HoL) were scheduled to open at 11am and was greeted with this scene:

I was able to navigate through the Traditional HoL crowd to the slightly shorter Bring Your Own Device (BYOD) line, indicated by this nice guy:

After the doors opened, I followed the line inside to the BYOD Check-In Desk. While in line, some very helpful green-shirted VMware folks explained how to prepare our machines for the HoL. After handing over my conference badge to the folks at the table, they entered me into the system and I proceeded to the BYOD Configuration Desk across the room:

I'd decided to take someone else's advice and use my iPad to log in to the http://vmwarecloud.com site from the HOL wifi (only available inside the HoL area). That way I could read the lab guide instructions on my tablet and use my MacBook Pro to connect to the lab environment with the View Client. The waiting continued in the "Holding Tank" where I hung out with about 100 other folks waiting for my name to move up to the top of this screen:

While waiting, they had a small seating area set up where the folks that wrote the labs were presenting whiteboard sessions:

Once my name reached the top of the screen I headed to the Seating Desk where I obtained my password and access code to log in to the HoL site.

With this single-use code in hand I was guided to the BYOD HoL seating area where I set up to do my first lab! Based on what I've heard from previous VMworlds, I think it'll all be worth the wait.

See you on Twitter!

Saturday, August 25, 2012

VMworld 2012, Day 1

I'll admit it. I'm a newbie to VMworld. Yes, this is my first time. But I've been to a few Cisco Live conferences (about 10 I think) so I've been eager to experience VMworld!

I arrived at the San Francisco airport (SFO) this afternoon after a pleasant day of flying from St. Louis, and caught a cab to my hotel. FYI, it was about $50 and 25 minutes to the Westin at 3rd and Market. After checking into the hotel, I walked a couple blocks to Moscone South to check into the conference.

Self Check-In had numerous Dell laptops prompting to enter first and last names. It then found me in the system and re-prompted for my first name (I assume in case I wanted to use a nickname). I hit submit and it told me which line number to stand in to pick up my badge and lanyard.

SECURITY NOTE: I'm concerned that they didn't ask to see my ID when they gave me my badge. This has been standard procedure at Cisco Live for years and I hope they just slipped up with it being the first day.

[Follow-up: I heard back from several folks on the VMworld Help group in the SocialCast conference community site. They ARE supposed to ask for ID and VMworld staff will address this with registration folks.]

With badge in hand (or rather, around my neck), I found out I had to head to Moscone West for Materials Pickup. (Side note: I'm looking forward to getting lots of walking in this week!)

Entering Moscone West you get this scene. Note that Self Check-in is available in Moscone South AND West locations.

I decided to check out the wireless, so I connected to the "VMworld 2012" SSID on my iPhone. I tried browsing somewhere as a test, and a splash page popped up with a vendor advertisement (yawn) with a countdown timer ("5 seconds until launch"). However, the first time I tried it showed me this weird login screen:

I walked over to the Technical Support desk near the Southeast entrance and asked about it. They indicated that it acted weird like that and I should try it again. When I did, voila, I got connected and it immediately directed me to the VMworld Mobile website (which I had already logged into earlier in the day).

Having completed my mission for the day, I headed back to the hotel to do the "Backpack Unboxing" and take some photos, shared for you below. I'm very excited to be here and grateful for the beautiful weather (it was about 67F when I arrived this afternoon).

Hit me up on Twitter (@swackhap) with your comments or questions, or leave a note below.

Wednesday, August 8, 2012

Unifying Wired and Wireless Edge with Aruba Tunneled Nodes

Anyone familiar with modern lightweight access points (APs) knows and understands the basics: Client connects to AP, AP tunnels traffic back to controller, and administrators can specify all sorts of useful policies in the controller. Aruba Networks has taken this concept of the wireless edge and extended it to the wired edge of the network with their Tunneled Nodes and Mobility Access Switches. The company I work for has very old closet switches and, since we're pretty heavily invested in Aruba wireless, I'm intrigued by the concept of unifying wired and wireless edges.

With a sample switch acquired from my account team, I spent a couple hours with my SE getting the basic introduction to Aruba's Ethernet switches. The goal of the session was to get the switch set up as a "wired AP" connected to a local controller, and when a laptop would connect to a particular port, the switch would then build a GRE tunnel to the local controller where the laptop's traffic would get dumped out onto the specified VLAN. Unfortunately, we weren't able to complete the setup, so my SE and I agreed to engage the TAC for further assistance.

My experience with the TAC was less than stellar this time around, but I believe it was mostly due to how new this technology is and that many TAC engineers haven't had time to learn it inside and out yet. Eventually I was able to reach an engineer that could identify a fix, and it turned out to be fairly simple. In addition, a high-level support supervisor called me personally to apologize and really listened to my recommendations for how to improve service.

Before the big reveal, here are the technical details of the setup.

We used a test laptop connected to port 2 of the Aruba switch, which was uplinked to a Cisco switch at my desk via an access-port on vlan 221. That Cisco switch was connected through a trunked 802.1q LAN to the local controller. See the diagram for a topology overview.

When we first set things up, the tunneled-node (a.k.a. the laptop in this case) showed a state of “in-progress” (see output of “show tunneled-node state” command) and would never get to the “complete” state.

In problem state:

(ArubaS3500) #show tunneled-node state

Tunneled Node State
-------------------
IP            MAC                Port     state        vlan  tunnel  inactive-time
--            ---                ----     -----        ----  ------  -------------
10.20.20.125  00:1a:1e:10:fb:c0  GE0/0/1  in-progress  0221  4094    0000

Here are the most important parts of the switch and controller configurations.

Switch:

ip-profile
   default-gateway 10.22.16.1
   controller-ip vlan 221
vlan "221"
interface-profile switching-profile "vlan221"
   access-vlan 221
interface-profile tunneled-node-profile "tunnel-local-controller"
   controller-ip 10.20.20.125
   backup-controller-ip 10.20.20.123
interface gigabitethernet "0/0/1"
   switching-profile "vlan221"
interface gigabitethernet "0/0/2"
   tunneled-node-profile "tunnel-local-controller"
   switching-profile "vlan221"
interface vlan "221"
   ip address 10.22.17.200 netmask 255.255.240.0

Local Controller:

vlan 220 "Backbone"
vlan 221 wired aaa-profile "s3500aaa"
interface vlan 220
   ip address 10.20.20.125 255.255.255.0
tunneled-node-address 10.20.20.125
aaa profile "s3500aaa"
   initial-role "authenticated"
aaa authentication wired
   profile "s3500aaa"
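As a quick sanity check on the addressing in these configurations, here is a small Python sketch using the standard-library ipaddress module. It only uses the addresses listed above; the variable names are mine. It confirms that the switch's default gateway sits inside the switch's own subnet, and that the controller does not, so the tunnel traffic has to be routed through that gateway.

```python
import ipaddress

# Addresses copied from the switch and controller configs above.
switch_if = ipaddress.ip_interface("10.22.17.200/255.255.240.0")
gateway = ipaddress.ip_address("10.22.16.1")
controller = ipaddress.ip_address("10.20.20.125")

# The /20 mask puts the switch in 10.22.16.0/20.
print(switch_if.network)                # 10.22.16.0/20
print(gateway in switch_if.network)     # True: gateway is on-link
print(controller in switch_if.network)  # False: controller is reached via the gateway
```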

The core problem ended up being the “tunneled-node-address” command on the controller. We had set it as the IP address of the controller itself, but the TAC identified this as the problem and changed it to all-zeros, like this:

tunneled-node-address 0.0.0.0
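If you keep controller configs in text files, a check like the following could flag the setting that bit us. This is only a sketch in Python: the function names are hypothetical, and the rule it encodes (a tunneled-node-address equal to one of the controller's own interface IPs is suspect) comes from this one TAC case, not from any Aruba documentation.

```python
import re

def interface_ips(config_text):
    """Collect IPv4 addresses assigned by 'ip address <ip> <mask>' lines."""
    pattern = r"^\s*ip address (\d+\.\d+\.\d+\.\d+)\s"
    return set(re.findall(pattern, config_text, re.MULTILINE))

def tunneled_node_address(config_text):
    """Return the configured tunneled-node-address, or None if absent."""
    match = re.search(r"^\s*tunneled-node-address (\S+)", config_text, re.MULTILINE)
    return match.group(1) if match else None

def check_tunneled_node_address(config_text):
    """Return a warning string if the setting matches an interface IP, else None."""
    addr = tunneled_node_address(config_text)
    if addr is not None and addr in interface_ips(config_text):
        return f"tunneled-node-address {addr} matches an interface IP on this controller"
    return None

# Controller config fragment from this post (the problem version).
problem_config = """\
interface vlan 220
   ip address 10.20.20.125 255.255.255.0
tunneled-node-address 10.20.20.125
"""
fixed_config = problem_config.replace(
    "tunneled-node-address 10.20.20.125", "tunneled-node-address 0.0.0.0"
)

print(check_tunneled_node_address(problem_config))  # warning string
print(check_tunneled_node_address(fixed_config))    # None
```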

Finally, the tunneled-node came up in the “complete” state (see output below) and I was able to get a DHCP address on the laptop and connect to the rest of the network.

When the problem was fixed:

(ArubaS3500) #show tunneled-node state

Tunneled Node State
-------------------
IP            MAC                Port     state        vlan  tunnel  inactive-time
--            ---                ----     -----        ----  ------  -------------
10.20.20.125  00:1a:1e:10:fb:c0  GE0/0/2  complete     0221  4094    0000
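If you wanted to watch for this condition automatically, the "show tunneled-node state" output is easy to pick apart. Here's a rough Python sketch that assumes the seven-column layout shown in this post (I haven't checked it against other software versions) and reports any port whose state isn't "complete". The function and field names are my own.

```python
FIELDS = ["ip", "mac", "port", "state", "vlan", "tunnel", "inactive_time"]

def parse_tunneled_node_state(output):
    """Turn each data row of the command output into a dict keyed by FIELDS."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have seven columns and begin with a dotted IPv4 address;
        # this skips the prompt, title, header, and dashed separator lines.
        if len(parts) == len(FIELDS) and parts[0].count(".") == 3:
            rows.append(dict(zip(FIELDS, parts)))
    return rows

def stuck_ports(output):
    """List the ports whose tunnel has not reached the 'complete' state."""
    return [r["port"] for r in parse_tunneled_node_state(output) if r["state"] != "complete"]

HEADER = """\
(ArubaS3500) #show tunneled-node state
Tunneled Node State
-------------------
IP MAC Port state vlan tunnel inactive-time
-- --- ---- ----- ---- ------ -------------
"""
problem_output = HEADER + "10.20.20.125 00:1a:1e:10:fb:c0 GE0/0/1 in-progress 0221 4094 0000\n"
fixed_output = HEADER + "10.20.20.125 00:1a:1e:10:fb:c0 GE0/0/2 complete 0221 4094 0000\n"

print(stuck_ports(problem_output))  # ['GE0/0/1']
print(stuck_ports(fixed_output))    # []
```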

Hit me up on Twitter (@swackhap) or leave your feedback below.

Monday, July 16, 2012

Automating Exchange Bandwidth Limits with SolarWinds Orion NCM

As I've written about previously, one of the many tools I work with is SolarWinds Orion Network Configuration Manager (NCM). It's a great tool to capture device configs on a daily basis, and for scheduling off-hours changes or regularly scheduled processes that may happen weekly, daily, or even multiple times per day.

Recently our messaging team started replicating Microsoft Exchange data stores from our primary datacenter in the US to another location in the Far East (FE). In this case, there's only a 4.5 Mbps circuit connecting the locations, and the replication traffic started interfering with production traffic in the FE. Even with QoS on the link and Riverbed Steelheads optimizing the traffic like nobody's business, we still needed to do something. The decision was made to cap the Exchange Replication (henceforth referred to as EXREPL) traffic.

Using the Steelhead's Advanced QoS configuration we set the upper bandwidth (BW) % to 33% of the 4.5Mbps link (see below).

But we only needed to keep this limit in effect during the local daytime, and at other times we could let more EXREPL traffic through. Despite the beautiful Web GUI that Riverbed uses, there's also an excellent CLI. The question became "What commands can I use to modify the upper BW% for the EXREPL QoS class?" After a bit of reading through the CLI Guide, I found the proper format:

qos classification class modify class-name "EXREPL" upper-limit-pct 33
qos classification class modify class-name "EXREPL" upper-limit-pct 90
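To put those two percentages in perspective, here's the arithmetic as a tiny Python sketch. It assumes the percentage applies to the full 4.5 Mbps circuit and that 1 Mbps = 1000 kbps; the function name is just for illustration.

```python
LINK_KBPS = 4500  # the 4.5 Mbps circuit mentioned above

def cap_kbps(upper_limit_pct, link_kbps=LINK_KBPS):
    """Bandwidth ceiling in kbps for a given upper-limit percentage."""
    return link_kbps * upper_limit_pct / 100

for pct in (33, 90):
    print(f"{pct}% of 4.5 Mbps = {cap_kbps(pct) / 1000:.3f} Mbps")
# 33% of 4.5 Mbps = 1.485 Mbps
# 90% of 4.5 Mbps = 4.050 Mbps
```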

Rather than sit at the keyboard and execute these commands twice per day, I set up scheduled SolarWinds Orion NCM jobs to run the following:

M,T,W,Th,F at 7am CT

config t
qos classification class modify class-name "EXREPL" upper-limit-pct 33
end
write mem

Su,M,T,W,Th at 6pm CT

config t
qos classification class modify class-name "EXREPL" upper-limit-pct 90
end
write mem
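Since the two job scripts differ only in the percentage, you could generate them from one template instead of maintaining two copies by hand. Below is a minimal Python sketch of that idea. The helper name is hypothetical and it has nothing to do with NCM's own scripting features; it just builds the same four lines shown above so they can be pasted into a job.

```python
def ncm_script(upper_limit_pct, class_name="EXREPL"):
    """Build the Steelhead command sequence for one scheduled job."""
    if not 0 <= upper_limit_pct <= 100:
        raise ValueError(f"percentage out of range: {upper_limit_pct}")
    return "\n".join([
        "config t",
        f'qos classification class modify class-name "{class_name}" '
        f"upper-limit-pct {upper_limit_pct}",
        "end",
        "write mem",
    ])

morning_job = ncm_script(33)  # M,T,W,Th,F at 7am CT
evening_job = ncm_script(90)  # Su,M,T,W,Th at 6pm CT

print(morning_job)
print()
print(evening_job)
```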

Each scheduled job also fires off e-mail alerts to the network and messaging teams to keep everyone in the loop. For small teams like mine, this tool is invaluable in its flexibility. Now twice a day, like clockwork, NCM happily does its job and lets us know if it succeeded or had problems. Another crisis averted!

What kind of simple (or complex) automation do you use? Hit me up on Twitter (@swackhap) or post a comment below.

Tuesday, June 26, 2012

Network Disruption Causes vCenter DB Corruption

First off, I am NOT a VMware expert by any stretch of the imagination. I AM however learning a lot working with some smart folks in virtualized servers and desktops.

A network engineer (who shall remain nameless) was making some changes to the network infrastructure last night and unfortunately experienced an outage. Due to an ongoing network migration from Cat6500 to Nexus 7k/5k/2k, all ESX hosts are now connected to Nexus FEX, but the iSCSI storage is still on the old Cat6500. The outage basically cut connectivity between the Nexus-connected hosts and the iSCSI storage.

As users started trying to log in to their desktops in the morning, we started getting reports of problems. Our VDI vCenter showed 4 of our 20+ hosts disconnected or not responding. We ended up power-cycling those, one at a time, and once they came up we were able to reconnect them to vCenter.

The next big problem was that the profile server, which runs as a VM in the VDI infrastructure, was hung while attempting to migrate. We rebooted vCenter which orphaned the profile server, but we found we were unable to browse the particular LUN where that VM's datastore existed to add it back into vCenter. At that point, we engaged VMware support and spent several hours on WebEx troubleshooting storage connectivity problems (tail -f /var/log/vmkernel and some other stuff). By the time I left in the early afternoon we had identified half a dozen hosts that seemed to be having iSCSI problems based on what VMware Support was seeing in the logs, and we rebooted those hosts one at a time to minimize end-user impact.

I had to leave before the fun was all over, but found out afterwards that apparently a couple of the hosts got duplicates of the datastore IDs on them when they recovered from the outage overnight. Once that happened, the database was somehow corrupted with the wrong datastore information. It was apparently cleared by removing the two particular hosts from vCenter and adding them back in, thus giving them new datastore information.

Like I said, I'm not a VMware expert but I'm learning more each day. You ever experience something like this? Who else is doing VDI? Leave your comments below or find me on Twitter (@swackhap).

Saturday, June 23, 2012

Cisco Live Tips and Tricks

Hard to believe it's been over a year since my last post here. As I've learned in life though, sometimes you have to forgive yourself for your failings (in this case, not blogging for a while) and then you can continue to improve on yourself.

I recently attended Cisco Live 2012 in San Diego. After attending 9 times (thereabouts), I figured I'd share some ideas/thoughts/tips.

First off, have a 10-foot extension cord when traveling and when attending sessions. Many breakout sessions and labs are in rooms that have power strips available, but some do not. If your extension cord has a 3-prong plug, have a 3-prong to 2-prong adapter with you just in case you need to plug into an old outlet.

The World of Solutions (WoS) is the area where Cisco and their partners set up booths with all sorts of goodies. The first night it may be okay to wander a bit, but at some point you need to HAVE A PLAN. Look over the list of exhibitors. Think about your goals for the conference. Are there particular problems at work that you're trying to solve? The WoS is THE PLACE to find the solution. Print a map of the booths and circle the ones you want to visit. Then cross them off after you've been there. Stay focused!

Some of my favorite places in the World of Solutions:

Walk-In Hands-On Labs - Great place to spend a few minutes learning new skills and practicing configurations on a plethora of systems.
Cisco Booth - Incredible opportunity to learn about almost every product/system/solution that they sell.
Social Media Hub - For the first time this year, the folks behind all the social networking for the event, such as the @CiscoLive Twitter account, set up shop to show off the top Tweeters and give people a place to lounge for a bit.
Technical Solutions Clinic - Basically an engineer's Heaven-on-Earth, there are several dozen whiteboards surrounded by some of Cisco's smartest Technical Marketing Engineers and TAC folks. What problem did you have at work you've been trying to fix? They'll solve it for you.

The Cisco Live mobile app makes navigating the conference a snap. View your schedule of sessions, browse WoS exhibitor listings and conference maps, and complete evaluations of sessions you've attended, right on your phone or tablet. The evaluations are incredibly important and Cisco takes them very seriously.

I'm very excited to have attended Cisco Live once again, and hope to continue doing so. I consider a week at Cisco Live equivalent to about 3 weeks' worth of training.

If you have any questions, comment below or hit me up on Twitter (@swackhap). Cheers!