Linux Infrastructure Management Part-V

Solution Delivery

Well, we have had a look at problems and how to tackle them, but delivering the solution to external as well as internal stakeholders is a different job altogether. And ultimately it counts a lot towards team performance, as measured by some “Black Suit” wearing fellow sitting on top.

In case you have missed the previous articles of this series, here they are:

Part-One, Part-Two, Part-Three and Part-Four

But from a technical standpoint, we have to make sure everything is in place so the solution is sustainable and reusable in the future (as I mentioned while covering the importance of the Wiki et al. in Part-Three on information management). Every solution should be organized so it reads like a KB entry. That saves time and headaches later, and it certainly helps to automate things and makes enhancements a lot easier.

So, different enterprises approach problem-solving in different ways, and a few of them stand out. The solution should carry a link to the wiki, or wherever we post the steps and outcomes of that particular problem. Say the infrastructure has a perennial problem with httpd, and it has been solved with some monumental effort. That effort should not go to waste; it should reach the concerned persons (the black suit-wearing jerks!) in a proper format. I have a habit of copying almost everything from the terminal (yeah, I know there are better ways of doing it) and at least pasting it into a note-taking place, if not another vim buffer. Later I can sort and filter out the absolutely necessary information (processing the raw data) and put it into a wiki or document space.

Solutions should be absolutely clear and precise. By that I mean a step-by-step approach whenever possible; only if required should it go into the details, explaining the cause in a small vignette. Yes, you got it right: we need to write a clear SOP for every problem we faced and solved. Those SOPs get exchanged with the guys up in the verticals. But do those guys care about them?? NOOOO, they don’t. They only care about the metrics, and they need those to assess the delivery (oh no!), yep, that’s the way.

I am just providing an SOP here for reference (yeah, I know a few folks can write an even better one):

Under Monitoring Standard:

Software: Nagios

Threshold: Max and Min value

Service Under Investigation : Httpd/Apache/Apache2

Restart Procedure :

/sbin/service httpd graceful

/sbin/apachectl graceful

Changes: Back up the original file with the date and the engineer’s name appended, in the same directory.

Check before Bounce: Run “pmap” on the httpd process to inspect its memory map, and run “ipcs” to find the IPC resources (shared memory, semaphores) held by the httpd process.

If, and only if, httpd really has to be killed, please use “ipcrm” to remove its stale shared resources, then go for the restart.
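To make that pre-bounce check concrete, here is a minimal sketch, assuming a stock Red Hat-style layout and a single parent httpd process (the shmid in the comment is illustrative):

#!/bin/bash
# Pre-bounce check sketch: inspect memory and IPC resources of the parent httpd process.
PID=$(pgrep -o httpd)                   # oldest httpd PID, i.e. the parent
pmap -x "$PID" | tail -3                # memory map summary for that process
ipcs -m -p | awk -v p="$PID" '$3 == p'  # shared-memory segments created by it
# Only if httpd really must be killed: remove its stale segments first, e.g.
#   ipcrm -m <shmid>
# then bring the service back with: /sbin/service httpd graceful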

Resolution Aspects:

  1. Config file test for httpd service

A: /usr/sbin/apachectl configtest

  2. Checking the log files is considered a good habit.

A: tail -f /var/log/httpd/access_log and tail -f /var/log/httpd/error_log

Alternatively, check the site-specific logs in the respective domain’s log directory.

Resolving: High Apache/Httpd Memory Usage

Apache can be a big memory user. Apache runs a number of ‘server’ processes and shares incoming requests among them. The memory used by each server grows, especially when the web page being returned by that server includes PHP or Perl that needs to load new libraries. It is common for each server process to use as much as 10% of a machine’s memory.

To reduce the number of servers, you can edit your httpd.conf file. There are three settings to tweak: StartServers, MinSpareServers, and MaxSpareServers. Each can be reduced to a value of 1 or 2 and your server will still respond promptly, even on quite busy sites. Some distros ship multiple versions of these settings depending on which process model Apache is using; in that case, the ‘prefork’ values are the ones to change.

To get a rough idea of how to set the MaxClients directive, find out how much memory the largest Apache process is using. Then stop Apache, check the free memory, and divide that amount by the size of the Apache process found earlier. The result is a rough guideline that can be used to further tune the MaxClients directive up or down.
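A rough sketch of that arithmetic in shell; here I cheat a little and read MemAvailable on a live box instead of stopping Apache first, so treat the output purely as a starting point:

#!/bin/bash
# Ballpark MaxClients: available memory divided by the largest httpd process size.
LARGEST_KB=$(ps -C httpd -o rss= | sort -n | tail -1)      # biggest httpd RSS, in KB
AVAIL_KB=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)  # memory you could hand to Apache, in KB
echo "Largest httpd process: ${LARGEST_KB} KB"
echo "Rough MaxClients guideline: $((AVAIL_KB / LARGEST_KB))"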

Setting “MaxRequestsPerChild” to a non-zero limit will work around some memory-leak problems, but that value has to be chosen judiciously.

Which MPM the server is running, and how it will serve the content, dictate which parameters to set in the main config file.
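For reference, on a prefork build the relevant block in httpd.conf might look something like this; the numbers are purely illustrative, not a recommendation (and note that Apache 2.4 renamed MaxClients to MaxRequestWorkers):

<IfModule prefork.c>
    StartServers            2
    MinSpareServers         2
    MaxSpareServers         4
    MaxClients            150
    MaxRequestsPerChild  4000
</IfModule>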

How to find ALL the virtual hosts on a shared box:

/sbin/httpd -S

How to find ALL the modules at once:

/sbin/httpd -M

How to find the MPM on a running server:

/sbin/httpd -V

Tips to overcome bottlenecks:

  1. It’s best to set StartServers and MinSpareServers to high numbers so that if you get a high load just after the server has been restarted, the fresh servers will be ready to serve requests immediately.
  2. Set MinSpareServers and MaxSpareServers to similar (or even the same) values.
  3. Having MaxSpareServers close to MaxClients will completely use all of your resources (if MaxClients has been chosen to take full advantage of the resources) and make sure that at any given moment your system will be capable of responding to requests with the maximum speed (assuming that the number of concurrent requests is not higher than MaxClients; otherwise, some requests will be put on hold).
  4. Try running a benchmark with Apache’s built-in tool, ab. It is not sufficient on its own, but it gives you enough detail to see where things are lagging (see the sketch after this list).
  5. The server-status and server-info pages should be enabled (mod_status and mod_info).
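A quick way to exercise those last two tips, assuming mod_status is enabled and with the hostname as a placeholder:

#!/bin/bash
# Apache's built-in benchmark: 1000 requests, 50 concurrent (URL is a placeholder).
ab -n 1000 -c 50 http://www.example.com/
# If mod_status is enabled, a plain-text snapshot of the scoreboard:
curl -s "http://www.example.com/server-status?auto"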

Checkpoints too:

  1. Please check the docroot for changes, i.e. file permission changes, modified files, or files relocated inside the document root.
  2. Please check whether the httpd/Apache process is alive or not.
  3. Inspect the process with tools like top and vmstat.
  4. Find the files opened by the httpd process using a system tool, i.e. lsof.
  5. Check which users or processes are accessing those files with a tool like fuser.
  6. Check the website in question with some external service, e.g. http://www.downforeveryoneorjustme.com/
  7. If the site has an SSL connection, check it with openssl s_client from the CLI to make sure it responds properly (a quick sketch follows the script below).

  8. Check the website landing page response time with the below script:

#!/bin/bash
# Measure connect time, time to first byte, and total time for a URL via curl's -w timing variables.
CURL="/usr/bin/curl"
GAWK="/usr/bin/gawk"
echo -n "Please pass the url you want to measure:  "
read url
URL="$url"
result=$($CURL -o /dev/null -s -w '%{time_connect}:%{time_starttransfer}:%{time_total}' "$URL")
echo " Time_Connect     Time_startTransfer   Time_total "
echo "$result" | $GAWK -F: '{ print $1"               "$2"                   "$3 }'
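And a few one-liners for checkpoints 4, 5, and 7 above; the domain and docroot path are placeholders:

# Files held open by all httpd processes:
lsof -c httpd
# Who is using a particular file under the docroot:
fuser -v /var/www/html/index.html
# Quick TLS sanity check from the CLI:
echo | openssl s_client -connect www.example.com:443 -servername www.example.com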

I would like to record the failures too!! That way I get better insight into problem-solving. You can capture them in several ways on the server itself and then send the file by mail, or to a central location from where you can audit it. One of the ready-made tools is the classic UNIX tool script. It records everything typed at the terminal and creates a file (named typescript by default) in the current directory, although you can tweak that. There are many more “Enterprise ways” of doing this ... which I am not too inclined to discuss.
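A minimal way to use it during a troubleshooting session (the log path is just an example):

# Record the whole session, appending, with a date-stamped file name:
script -a /tmp/$(whoami)-$(date +%F)-session.log
# ... do the troubleshooting ...
# then type 'exit' or press Ctrl-D to stop recording.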

RCA is a big buzzword in less IT-driven companies ... I have worked for them, and I know how pathetically they write those: full of nonsense stories, kind of a red herring.

Writing an RCA requires a lot of insight into the system, and the details need to be gathered with the proper tools and interpreted correctly. It should lay out the cause of the problem in invigorating detail, and the reader should get excited to learn the facts. Again, NO STORIES PLEASE. There is a plethora of tools built into Linux which can assist a great deal. What I am trying to say is: keep the dependency on third-party tools minimal. The more you depend on them, the more the process deviates from the facts. Yep, I know those tools can give you some eye-candy which will satisfy the needs of a “Black Suit” wearing guy, but that’s that, nothing more. You practically get opaque information about the event, and by the way, it usually needs some specific daemon running in the box with some arcane business license.

So, I would prefer to bake some tools in while building the server or building the AMI for future work. If it is a production or public-facing box, please, for heaven’s sake, DO NOT INSTALL the development libraries on it. I am just trying to close one more door to the bad guys. Then, when a bad event happens, we can capture the thing to the point.

I personally write RCAs in plain text form (that is the best way I can describe the problem-solution capture). The RCA should not be too long or filled with boring details, but it must have some pure technicality attached to it. Most of the time it should be restricted to three paragraphs; I believe that is good enough. Oh yeah, you have to pray hard that the set of people you send it to actually read it, and read it thoroughly (because you have invested a lot of invaluable time to figure it out properly), barring those “Black Suit” wearing fellas, who only rely on the metrics. You should be ready to explain every detail you capture out of the problem state if someone comes back to you with good intentions (you can figure that out very quickly). So the more you understand the problem and the solution you are driving for, the better, and the more easily you can convey it to less technical people. Yep, that is the thing you need to learn, practice, and deliver.

Now, you must have a proper technical documentation writer at your disposal. The RCA you write is not acceptable as-is to the overall client and those “Black Suit” wearing fellows. The technical documentation person will take your plain-text RCA and put it into the “Enterprise ready” format to send to those fellas. That is the corporate norm. The person should be well versed, or should have enough bent of mind, to understand what you did and not try to tweak, alter, or break it while putting it into a “more” readable format for those truly lazy fellas. You must have a session with the technical documentation expert after he/she formats your RCA, just to check that nothing got distorted or deviated from what you wanted to deliver. A little bit of ITIL knowledge would not harm these activities on either side. I am sorry if I sound pretty “Enterprise” in the above; I am solely thinking of BU infra management. In the open-source world, we could do it in much more varied ways.

Next, I shall discuss the importance of automation and software configuration management.

Linux Infrastructure Management Part-IV

Problem Management

This post will make you aware of how problems related to infrastructure can be handled and mitigated in an efficient way.

Oh, if you missed the previous parts of this series, here they are:

Part-One, Part-Two and Part-Three.

I am the kind of person who doesn’t believe in stories but in what is actually happening. And I strongly believe that living with a bunch of people all the time gives you more of a “black and white” picture than distorted things with some colorful stories. It saves my time and theirs too.

So, in a broader sense, problems come in a bunch and go away in a bunch; yep, the general rule of life. But in a BU, problems can be categorized into different forms, i.e. the genuine problem, the created problem, the arse-saving problem, the incompetency-guard problem, et al. Frankly, I am not at all interested in any but the first one. Unfortunately, you have to deal with the others whether you like it or not, and you need to take some really drastic steps to eliminate those kinds of “cosmetic” problems from your life and focus on the main problem.

Now, for instance, I have been in the BU’s infra for a few organizations, so I can vouch that an emergency can come once a month or bi-monthly, and you can deal with it. But if emergencies start coming more frequently than that, then two things should come to your mind. One, the infrastructure needs a serious, rigorous review, and quickly. Second, you need to evaluate the guys who are operating or managing it. There is no other way, I believe; you just cannot sit on it for long. It might explode at some undesirable time and make you look like a fool. So, long story short, proactiveness is highly desirable. We are generally good at being reactive.

Now, when you manage a team, you need to split things up into important and urgent, and everybody does something special. I would prefer to do the urgent work myself, or assign it to someone on the team who I believe can accomplish the task without much fuss, while the important things get distributed among the others along with a proper ETA. Sound familiar? Bound to be! That’s the way it works in most places. On top of that, I would preferably have someone document the process if something new has come up. Being in charge of a team of any size, you need to understand whom to give what; a pretty ordinary thing, easily figured out. What I mean to convey is: distribute the ownership among the members.

So, basically, I make people accountable for what they are doing and also make sure they get due credit for it. That certainly boosts their morale and in turn brings more productivity and a sense of responsibility. When you are bestowed with that kind of honor, the chances are high that you gain more than you lose.

Okay, a different set of metrics has to be taken into account when solving problems related to infrastructure. I will nail it down to a few absolutely necessary ones, just to eliminate the noise from the actual facts. I have seen a bottleneck in problem-solving setups: too many levels of approval are needed for a trivial and insignificant job. Sometimes it really irritates me that slightly bigger organizations put in so many layers which are not at all necessary for the system to work properly. They create more noise, which is quite distracting. As I said above, a few good tools, along with some sensible folks, will make the infrastructure stable. If the humans are not trained enough, it is the duty of the leader to pour some good information into them, so they have no illusions about working in that environment and become productive quickly. Although, I have seen most “Managers” fail to do that in an alarming way. They just do the wrong thing for the right reason.

I am pretty okay with a few bumps; that’s fine, and they alert you way ahead of time so you can take precautions. But I am not sure how that can be measured, because every situation is different and the way to solve the problem would be quite different too (although the basics remain the same: solve the problem). Educating team members is of ultimate importance. The more informed they are, the better they perform.

In the next piece, I shall delve into the solution delivery aspect.

Linux Infrastructure Management Part-III

Information Management

“Information Management” means keeping the information in an easily accessible format. Many companies do it in different ways. There must be some sort of uniform way to manage information.

Oh, BTW, in case you missed the first two parts of this series, here they are: Part-One and Part-Two.

I am a huge fan of wikis and install one if it is missing from the existing framework. It keeps everyone on the team on the same page. A wiki can be maintained in such a fashion that the information is easy to read and interpret. And do not forget that during a crucial time, the information should be available with ease.

So, first things first: installing a wiki is pretty easy compared to other things. I have done it, so I am going to share it with you. I like the way DokuWiki handles information, so that would be my choice. It also depends on how many people are comfortable with it, and we need to think about its maintenance too; the more people who know about it, the better. You need to go to the DokuWiki site, pull the download from there, and follow the instructions on the same page to install and manage it.
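A minimal install sketch on a Red Hat-style box; the docroot, the web-server user, and the stable tarball URL are assumptions, so check the DokuWiki download page for the current details:

#!/bin/bash
set -e
cd /var/www/html                                   # assumed Apache docroot
curl -fLO https://download.dokuwiki.org/src/dokuwiki/dokuwiki-stable.tgz
tar xzf dokuwiki-stable.tgz
mv dokuwiki-*/ dokuwiki
chown -R apache:apache dokuwiki                    # web server user on RHEL-like systems
# Finish the setup in a browser at http://<server>/dokuwiki/install.php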

Having a homegrown knowledge base is extremely important for people who delve into infrastructure, because it helps make things quickly resolvable, and there is a standard place in the infra to look for solutions.

My inclination is to have something internal, not necessarily connected to the internet or needing the internet to fetch anything. That means it is freely accessible within the concern, which is very common. Nowadays companies put lots of docs on the internet for public consumption, but certainly not this kind of operational stuff. And it might not be useful to outsiders either, because everyone operates in a different way.

Once you have the system installed, you are supposed to access the web UI, add pages according to the wiki style, and keep it updated. Then share the URL for everyone to read, but allow only designated people to update it, because it might play a huge role in maintaining the infrastructure.

For instance, you might have solved a critical problem in the infrastructure, but if you failed to document it properly, you will be in trouble (because similar kinds of problems come to infra management very often, and if the gap is long, chances are high that you might forget that solution ... irk). So, to avoid falling into that trap, it is better to put the solution in the wiki, so in time it can be fetched from there to resolve the problem quickly.

In another case, say you are doing something very important to the infrastructure, probably with a team, and not everyone on the team is on the same wavelength (you have to understand that), so it might get screwed up quickly by somebody’s mistake. Then you can get back to the pristine state by looking at the road map for that particular project.

Do not trust anyone ... always validate where the information is coming from, how it is coming, and who is providing it! If you do, you have a safe bet. But still, we falter ... to err is human.

Okay, we can take advantage of several open-source tools to keep things in place and available in time, like Trac and other custom-grown KBs.

Next, we will discuss problem management related to infrastructure management.

Linux Infrastructure Management Part-II

Things To Consider

We are supposed to know the underlying infrastructure; I mean, how the servers are organized hardware-wise and what the network connections to those boxes are. Knowing those is kind of mandatory to operate efficiently.

Indeed, it plays an important role to know where those boxes reside; for cloud-related matters, it’s good to know which zone they are in.

Nowadays we have so many tools available to tell us about the underlying infra. In the old days we used to write scripts to get the low-level details and automate things; now the scope for that is a little less, as more and more smart people design and architect tools to take the pain away from ordinary mundane people like us. It helps immensely to spend your time doing the actual thing rather than wasting time figuring out where it is and how it works (yep, to some extent; it still requires some investigation and time to get into the gory details), but that is the curiosity part. A sensible infra person is hugely curious and conscious about what he or she is doing.

I personally refrain from doing anything if at least a major part of it is not clear to me, because we all know how bad “half-knowledge” is. A premature assumption can lead you nowhere and create havoc. Why get into that situation when you are supposed to operate it and bound to provide some result? No, I do not gauge people by the result; it sometimes misleads you. It is the approach that makes all the difference, and an open system (read GNU/Linux and related systems) allows us to navigate deep down, provided you are willing to spend time with it. Okay, let me contradict that by saying that working in a BU doesn’t always bring you the luxury of time to get deep into it. I agree. But how about your personal time and your knack for it?? Does it take away from the job you are bound to do? If that is the case, then think about how to get on with it.

We should have an understanding of how the servers are built from the ground up. I mean getting to know the hardware specs, i.e. how many CPU cores? How much RAM? How many NICs? What about the power supply? Which rack is it seated in and what label is on it? Et al.
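On a running box, most of that can be pulled in a few commands; a rough sketch (dmidecode needs root, and the identity fields depend on what the vendor populated):

#!/bin/bash
# Quick hardware inventory sketch.
lscpu | grep -E '^(Model name|CPU\(s\))'           # CPU model and core count
free -h | awk 'NR==2 {print "RAM total:", $2}'     # total memory
ip -o link show | awk -F': ' '{print "NIC:", $2}'  # network interfaces
dmidecode -t system 2>/dev/null | grep -E 'Manufacturer|Product|Serial'   # box identity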

That information will help immensely. I have been in a situation where I was asked to put the label on the iron box; in the cloud, you can tag it easily. Not only that, but looking at the different light colors on the server panel gives you a hint about what is going on inside, although it will not necessarily do so all the time. You have got to take into consideration the specific event you are working on.

An important note: nowadays everyone depends on Google and other search engines; I too have benefited from that. But I grew up in an era when the internet was not easily available in my country. I leaned on books (yes, hardcover and paperback; I still prefer them!) and peer knowledge (you might be lucky enough to find a proper person to enhance your knowledge; a lot of factors are involved).

Get things into your head by yourself. I did. I do. And it takes time (because I don’t have a sharp bent of mind like you have), but once I get into it (which certainly has to interest me; at this age I am no longer in a bound-to-do-it situation), I am doing it for the sake of love and discovery.

I am not very impressed when I see people with airy-fairy ideas talk big and talk nonsense, because I have come across people who do that to gain attention out of false vanity. How dare I say nonsense? Because they just don’t have any use case with them; they put forward ideas for the sake of it, nor do they have any hands-on experience with it; that irks me a lot.

When you are not lucky enough to know everything, like when working with a cloud operator, you have to take a different route to get things done. And I sometimes get bemused by the way things get done. Oversimplification is a curse in the cloud (or I have some mental block about it; I need to figure that out). Lots of misnomers are floating around the cloud space and it is very easy to get lost in them.

Why not take some time to read the spec and documentation about it rather than jump the gun and work on it? I had a stumbling block there. I decided to give it a shot and read through a large chunk of it (only what concerns me; YMMV). And I believe I got hold of a few portions of it solidly; thank god I did.

I have had a particular trait for a long time: I can easily figure out whom to approach and whom not to, and what to listen to and what not to, most of the time. It has not necessarily come out positive every time, but most of the time I have benefited from that approach. Now, how do I distinguish that person? Intuition and gut feeling (again, YMMV). Having said that, I am not stuck with a particular method or rule (I hate that like anything); I am pretty open to anything fruitful.

I like people who are in my face, and I like to be called an “idiot” if whatever I propose or do doesn’t make sense to them. Because, by being told “idiot” or some similar thing, two things come out: the person really cares about you and wants you to improve (I have seen some skeptics and naysayers; they are just saying it out of habit, and I can figure that out easily and give them back what they deserve), and you get a chance to introspect about something you had stopped thinking about. It helps you become much stronger, more efficient, and more respectful to those who care. And it also does not allow you to live in an “illusionary bubble” anymore. That’s good.

When you maintain stuff at a “low” level (I don’t mean really low level, but at the hardware level), you are supposed to be on good terms with the vendor. They can bail you out of a lot of hassle. I still remember opening up an IBM X series box; looking into it excited me a lot. I was a plain watcher; my expert colleague was doing all the work at that time.

Physical networking is of utmost importance when working in the data center. I generally keep a patch cable in my laptop bag most of the time, and I have seen big network experts carry a lot of physical tools in the data center. The blinking light on the ethernet port is crucial. Sometimes it does blink but you still fail to understand why the hell the server is not responding to network queries; in reality you need to check a few more places to confirm.

DO NOT REBOOT the machine for a trivial reason. First, servers take a little more time (sometimes a really long time) to come back, because lots of things get initialized while a server boots. Second, you can fix most network software-related issues from the console and bounce the service. Yep, if you are having hard luck with a physical fault in the network, then you are doomed; you need to get help from the network experts, the people with lots of physical tools I mentioned above.

OR, if you still do it without understanding the impact ... please make sure your CV is updated and well circulated.

Everybody who has spent enough time in the corporate environment knows these facts. I personally almost made that kind of mistake once; fortunately, my reporting boss helped me avoid it. When you have a rack full of servers with no labels on them, that can cause lots of trouble; in my case, it almost happened for exactly that reason.

Never run any automation script without the prior permission of the person who is in charge of it. I did, and I was castigated by the people whose jobs on the machine I had ruined, heck. Even if you are good at something, it still requires you to be on top of it to get the best out of it. I was not, in that case; I made people’s lives miserable. The important thing is that I took the lesson in a positive way: I wasn’t vigilant or informed enough to do that kind of thing in that environment. It’s not about running what you know; it’s all about how you run it and why you run it.

Cover your arse too! That gives people less chance to come after you. By that I mean whatever you do should have some checkpoints and mail related to it, and perhaps a doc as well. In case a question is raised by some “black suit-wearing” person, you can readily refer to those.

Linux ... Linux ... Linux ... All I have cared about throughout my endeavor, and I have cared less about other stuff (purely because of a lack of bent of mind and time to think about others, but that does not include open source ...). I believe thinking in a singular fashion sometimes helps you achieve more than thinking in a multi-dimensional way. At least it helps me stay confined within one domain and helps me grow. But you can stick to whatever interests you. In this book I will solely focus on GNU/Linux, because that is the thing I have been living with for a long time. I love it; I hate it; I embrace it; I proliferate it; I am endowed with it; the list can go on and on. Whatever I have learned has come from using it for over a decade and being exposed to different environments doing different activities.

Oh, BTW, managing an experienced NOC team and DevOps demands a more enhanced version of yourself. I have learned it the hard way; yep, indeed. Managing egoistic humans is certainly not fun. Machines are good, they do what I want, but humans are blessed with EGO, which is predominant in most of us and comes out from time to time. For some people it is always the way forward; heck, they seriously deserve a kick on their arse; sorry, no other way.

Okay, “your manager is always right”; is that so?? I don’t believe it. The only thing that separates you from your manager is exposure to more information about related matters. He or she might have gained it by some other means, but still, he or she is ahead of you. Respect them on that account, and make sure you extract what you need from them. Most of them talk loads of rubbish, so put a filter on their verdicts. They bring past events from their story into the present, forgetting that now things are different.

No, I am not saying disrespect them; give them the due they deserve. Moreover, who wants the story? Give me your code and I will figure it out myself; I don’t need your past story. Never say that to their face; just react like that! Then next time they will be cautious enough before taking you on. Am I ranting against managers?? A big NO. Reread the above paragraph. I am just pointing out the truth and what you should do. There are lots of good, and I mean genuinely good, people around who are managers; you just need a tiny bit of luck to bump into them. But alas! You will find the bad ones outnumber the good ones; indeed. Keep your fingers crossed for that and stop listening to stories.

Give yourself enough chances to fail by acquiring more knowledge and more work. Please make mistakes and learn from them. Get into discussions on IRC channels (to meet some rude guys) and in some forums (where most of the time half-cooked information is shared!! Except for two places, the Gentoo forum and the Arch Linux forum). I really like those places; people are explicit and to the point about problem-solving, and they expect the people who come there to be explicit and clear too. Am I biased? No, I am not.

Now, the more information you gather by any means (from your manager or by interacting with knowledgeable colleagues), the better the chance that you get over the obstacle quickly.

Read, read, read and practice, practice, practice; there is no substitute for those. I do. I am not going to give a lecture on that; I learned it that way. Investing in good books can benefit you in the long run. I personally have close to 100 UNIX/Linux books on my shelf (and have gone through them at least twice), not for the sake of collecting books and counting them in some league, but to explore and know more. Nowadays it has become even more possible and easy.

We are in a field that is constantly changing and progressing. Moreover, it is cognitive work, so the more we are armed with knowledge, the more we can thrive. Now, there is a catch. Because of Google, everybody becomes an expert (I have come across a few; oh hell!), and a lot of the information is not worth it. You need to identify what is required and what needs to be discarded. I at least have limited space in my gray cells, so I discard a lot, keeping only what will help me deduce something related to what I am doing or will be doing very soon. It is certainly not an easy job; doing so needs a certain concentration, like the way we configure software on servers. One silly mistake and you are in a position to miss the information for good. Sometimes it can be costly to miss it, and we do miss it. After all, we are human. To err is human.

Cloud ... OMG, cloud!!! ... did I mention that I struggled with it initially? Yes, I had a torrid time with the terminology used by the expert cloud infra folks. Okay, somehow I can now get hold of it, although not completely. The cloud has a magnificent upside and an equally wonderful downside too.

It takes away the overhead of maintaining the physical data center and related stuff, like cooling, personnel, et al. And it can spin up a server very quickly, so downtime goes for a toss. The cons: you lose some control over the hardware and networking, which is sometimes not good, and you become dependent on others to provide the underlying infra. One predominant misconception in people’s heads is that the cloud is cheap. NO, it is not. Period. You have to pay for every little thing, which can accumulate and exceed your budget in the long run.

There are lots of cloud players in the market; a few of the renowned ones are OpenStack, CloudStack, Eucalyptus, and OpenNebula. They offer services according to their strengths, but all of them are basically good. Do not forget AWS; they are the front-runner in that space. Most of the others support the AWS API for interoperability.

Getting your hands dirty with it (more on that later) will certainly help you excel in this field. I don’t know why, but I always prefer the CLI way of doing things; it probably stuck with me from the beginning. Only if necessary do I fall back on a GUI. Oh, BTW, you can use ncurses-based UIs in the terminal itself, and there are many tools available to do the required job; in fact, renowned distributions supply TUI versions of GUI tools, which is a good thing, because we will be working most of the time in a headless environment (headless in server terms: no X11 or GUI-related stuff installed, as a security measure). So get yourself accustomed to the CLI; it certainly helps. Moreover, it is much faster than a GUI, which, when invoked, brings along a plethora of things with it and in turn takes more time to get to work. Try running a GUI app from a terminal and you can see what is going on behind the scenes. When you operate on a server, you just cannot afford that clutter in your terminal, and if you suppress it, why bother invoking it at all when you have a much more lightweight alternative available?

Now, when you choose your OS, give it a thought: are you going for 32-bit (almost nobody uses it now) or opting for 64-bit? The architectures are quite similar, with a few tweaks: wider address and data buses, and system calls that can return faster. There is little visible difference between 32-bit and 64-bit apart from the naming convention of the /lib directory (/lib64 on 64-bit systems), but internally it can play a huge role, as I mentioned briefly above. Moreover, you are giving yourself more of a chance to embrace current developments on the hardware front, which in turn helps the server take advantage of the underlying technological advancements.

And what more? I think I have given enough details above as a heads-up. In the next article, I will discuss information management.

Linux Infrastructure Management Part-I

Basic Understanding Of Infrastructure

Understanding the environment you work in is an utmost need for any person working in the infrastructure domain.

For instance, whether it is normal physical infra or based in the cloud: a certain mind shift is needed to work in the two different environments.

The traditional Data Center environment is drastically different from the “Virtual Data Center” environment. Things are different.

I have been to both places, and operating is an undertaking in both. Co-lo infrastructure was the dominant factor before the cloud engulfed it.

In the physical data center, you need to interact with iron boxes, watch out for the cables, configure the rack console, check the lights on the front and back of the server, and take decisions based on that!!

In the virtual data center, you are not so fortunate as to see those things, but the software simulates the missing parts; sometimes it misses and sometimes it hits, depending on how the software is configured and which software it is (there is a plethora of that kind of software available in the market, both open-source and closed-source).

Virtualization arrived late (although it had been there for a long time, people ignored it and hardware support was minimal, so it never took off), but things have changed; people have become more aware of its capabilities and have started to take advantage of it.

“Cost-cutting” is mandatory in any IT organization’s portfolio these days, and they reap the benefits of it by opting for virtualization. But still, some important and mission-critical things run on physical kit. And I love it; there is nothing like seeing an iron box in front of you ... so you can view it with your x-ray eyes and touch it.

Sometimes I feel (absolutely my personal opinion) that the virtual environment is a little more fragile than the physical one; probably because I came from a normal DC environment to a VDC environment. I don’t mean to say it is inferior or bad, but it has many more ways to fail than the physical environment.

Anyway, being in infra management, you have to be well conversant with both environments. There is no other way; simple. Period.

Yes, one thing is very clear: you are bound to lean more on software to manage the data center, because the underlying parts, i.e. hardware and other kinds of stuff, are taken care of by the provider you opt for, e.g. Amazon, Rackspace, et al.

I had a struggle (and I am not ashamed of it, because it helped me learn what was not on my radar), and while moving to the cloud I had a horrid time, indeed. I still (sometimes) fail to understand the terminology people use to describe something. Let me give you an instance where I got confused.

I got into an environment where the cloud was a “mantra”, with a team working exclusively on clouds, and I kept hearing from them, many times a day, the word HA in relation to the project they were building. I was excited; building HA for anything is very exciting indeed. One day I was overly curious and asked one of them: what is all this fuss about HA? Where is Heartbeat (the software) and how has it been configured? But boy, what I came to know is that they were simply building things in two different places and using the AWS auto-scaling feature.

I was kind of disappointed, solely because my conventional mind jumped to the conventional way of building HA, heck ...

I had the good fortune to work for a pretty big, reputed IT giant for some time, and I saw it all. I still remember the first day; I went into the DC with my manager and was awe-struck to see thousands of physical boxes in different shapes (one shape dominating, though) and sizes ... oh boy! And the sound and the chill of the place!!

Now the downside. I was running some automation scripts which bothered some users; they complained to me, “what is your problem?” I was a bit wary at that time. Also, it was my fault for not taking consent from the person who looks after all that; taking the reins in my hands without having all the information was not a good thing on my part. Anyway, I learned the lesson all the quicker.

Working 24/7, 365-day schedules mars a lot of your plans and commitments, but one has to do it. Once you have taken the responsibility, you just cannot bypass it, because a lot of things might depend on your actions. Basically, I mean to say: be prepared for heads-ups frequently and repetitively, and pull your socks up all the time; no respite.

As a general rule: try to segregate the important things from the mass, so you can get hold of them in times of urgency. For example, you should have information about where the DNS servers are located and where the NIS servers are located. You can add a few more items which are absolutely critical for operations.

Last but not least, and extremely important: you have to work with a bunch of people like you, so try to mingle with them genuinely and make them feel happy about you, because they will be the ones who save your ass when you are in distress.

In the next article, I will discuss which components are important to consider for infrastructure.

Information Collection And Process To Keep Abreast

Well, it is important to have the relevant information at your disposal in a timely manner and to get the essence out of it, to utilize it in the best possible way to enhance your own knowledge and disseminate the correct version to others. While doing so, people sometimes miss the important stuff in the consumed content and, in essence, miss the purpose of it. I have been trying to disseminate information the correct way for a long, long time, and while doing that I throw away lots of noisy stuff from the information pool I look at.

Now, the proliferation of the internet allows people to look into nooks and corners which used to be difficult or somewhat restricted to reach, but can now be reached very easily. But ... be aware that the noise-to-signal ratio is always pretty high, so you have to be judicious enough to embrace what is needed and discard what is irrelevant. Having said all this, it is not an easy thing to do without the help of some sort of tools (read: software) to narrow down the stuff you are interested in or love to process to gain more knowledge.

RSS feeds are one damn good mechanism to get the interesting stuff at your disposal with minimal fuss, as are the mailing lists of some projects, especially the technically inclined ones. But there are other ways to get information delivered to you, and it can arrive en masse. So filtering out the required information needs some careful observation to deduce what to look for and what not to look out for.

And these mechanisms have been available for decades now. Having something like this at your disposal can be very beneficial, provided you know how to use the tools properly. Plus, it is a kind of training for the mind, to pick up the patterns they provide or operate on; the sooner you capture the theme, the better. Looking at the sources (yes, you can find that out pretty easily these days) also determines whether you want to get the information from that source or not. It is not about rudeness but about the limited capacity you have between the ears to hold and process information. So you restrict yourself to some sort of periphery, so as not to get swayed by unneeded stuff.

I use the Newsboat RSS reader and Elfeed as my prime news feeders. The news items are delivered mostly in text form, but they might contain links to other resources to consume. And for the technical stuff, there are mailing lists to supplement the information. Email is one of the oldest information-exchange mechanisms, and I still lean on it most of the time. 🙂 (A minimal Newsboat configuration sketch follows the links below.)

Here is what those specific tools and their information flow look like:

Newsboat

Elfeed

Mailing list of Linux Kernel

OR

Linux Kernel Mailing List Archive
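For Newsboat, everything that flows in is driven by a plain-text urls file; a minimal sketch (the feed URLs are placeholders, not recommendations):

# ~/.newsboat/urls -- one feed per line, followed by optional quoted tags
https://example.com/linux/releases.atom "linux"
https://example.com/security/advisories.xml "security"
# Then just run `newsboat` in a terminal and press R to reload all feeds.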

These are not fancy tools that produce eye candy, but well-crafted pieces of software that deliver what they were built for. Mostly text, but they can be configured to show fancier graphical things. Tools are important, and it is even more important to know about a tool you are supposed to use daily and long-term. The more you know about the tool you use day to day to get your information, the better you can manipulate it to filter tightly and get only the absolutely required stuff, and hence mitigate the distraction.

Sometimes a flurry of information streams might boggle you, and you might fall for some click-bait stuff which leads you to consume red-herring information. On the internet, one has to be extremely cautious about what to consume and, importantly, from where. The decisiveness to take stuff only from certain places comes from being judicious.

It is extremely important to get connected with people who only feed you the black-and-white stuff, in other words, the truth without much fuss. It saves time for both parties. Stories don’t go down well with my system, nor can I stand too much babbling. My eardrums shut down when I sense it, even if I am standing right in front of the babbler. Sounds rude?? Can’t help much regarding that; who the f*** wants to hear that?? The habit of listening is considered invaluable, but you need to be sure about what you listen to, whom you listen to, and how much you listen; those parameters decide when my eardrums shut down.

Awareness is something that can be developed over the years of a person’s lifetime, and it is applicable to every damn human being out there. I don’t believe, nor do I want to convince myself, that someone out there has developed it to the fullest; YMMV.

Reading can greatly benefit you, provided you know what needs to be read. Reading for the sake of reading doesn’t make much sense; moreover, it crowds your mind with information that might be completely useless for your understanding or use case in life. We all have limited time on this earth, so why bother investing in something which only fills it with garbage? Plus, whatever you are reading consumes invaluable time of your life. If you are not able to practice whatever you have read, or never get a chance to do so, then it might become a burden to carry that essence with you. I am not rooting for immediate benefit, but the essence should, or must, enhance the experience of a life lived properly by doing something meaningful.

Alright, I have had to inculcate a habit of reading, and hence writing, for the sake of my own enhancement. It is a conscious effort to get better and to help others get better at it by NOT annoying or poking them. (I have seen people who are extremely inclined to bestow their understanding on others by some frivolous act ... not good, though.) Stirring the interest of others in anything needs a lot of background checks and an understanding of the targeted person’s inclination. It is certainly not a trivial act.

But today’s pseudo-busy world, where people have an unprecedented inclination towards getting or achieving everything in a very short or less-than-required time, forces people to do so much stuff which they are not supposed to do in the first place. Anyway, to make sure you do not fall into a trap by looking at the glitz of some furious act or words, a cautious and mindful approach can help. Again, it develops when people watch, practice, act, and think in a consistent manner; that essentially means the thought process should synchronize with the act in mind.

The modern world is flooded with orators. Alas! There is, and always was, a gulf between the effective communicator and the orator. If a person is not able to communicate effectively for long, then the chances of going forward in any matter are pretty bleak. It is a consistent act of doing something very basic, but very important. Orators, on the other hand, are full of babbling (I am not denying that they get a chunk of their share, and sometimes the meat of it, these days), but that is not lasting stuff; it wipes out quicker than you might have thought. I can sense that form; probably listening to garbage over and over again has made my system recognize that kind of thing almost instantly. And that is probably the case with other people, too.

Processing information is as important as consuming it. Until you process the consumed information in the correct way, you are not in a position to deal with it or share it in a meaningful way. People do not do it right all the time; I certainly do not do it right all the time, but I am on a quest to improve upon it. In other words, the quest is on 🙂 Learning from sources you trust and rely on can give you some sort of assurance, as it did for me. I am, and was, very, very selective about the things that matter, as I mentioned earlier; some people tag that as inhibition. How poor can that be? Extremely poor. As human beings, we have the right to select what we want and, importantly, how we want it. Period. But when that conflicts with others’ preconceived dogma, people judge rather than evaluate. Again, not good enough.

To reinforce your understanding (or at least this is what I do), I make sure I do not deviate from the topic I am trying to convey or share with others. While you are at it, sharing something is considered good if you add something more meaningful to the already existing context, without reiterating the matter in some tongue-twisted words; that will certainly not be appreciated by sensible people.

Why bother with proper information exchange? Because it is a vicious loop, and if you haven’t done it right, it might come back and bite your arse someday. It is a sort of untold responsibility to take ownership of doing it right, for the sake of doing it well for yourself.

Flame me with your thoughts.