Troubleshooting Theory and Techniques

The following are notes taken from CompTIA A+ Certification Study Guide Seventh Edition by Jane Holcombe and Charles Holcombe, which covers the CompTIA A+ 2009 objectives.

Details

  • troubleshooting is the act of discovering the cause of a problem and correcting it
    • requires:
      • patience
      • instincts
      • experience
      • methodical approach

Preparation

  • when faced with a computer-related problem, resist urge to jump right in and apply favorite all-purpose solution
    • instead, take time to:
      • verify that recent set of backups of user data exists
        • if not, perform data backup, at minimum
          • back up entire system
  • assess a problem systematically, using a strategy of dividing large problems into smaller pieces for individual analysis, and apply troubleshooting theory
  • be prepared to question the obvious
    • do not even assume that
      • computer is plugged in
      • computer is powered up
      • all peripherals are securely connected
  • always have a pad and pencil, or a PDA, with which to note your
    • actions
    • findings
    • outcomes
  • these notes will be critical to documentation created at end of entire process

Troubleshooting Theory

  • general troubleshooting theory includes the following procedures:
    • identify problem
    • establish a theory of probable cause
    • test theory of probable cause
    • establish an action plan to resolve problem, and then implement solution
    • verify full system functionality and, if applicable, implement preventive measures
    • document findings, actions, and outcomes
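
The six procedures above run in a fixed order. A minimal Python sketch of that ordering follows. The names `Step` and `next_step` are illustrative and do not come from the study guide, and the rule that a failed test or failed verification sends one back to establishing a new theory is an assumption about how the loop is usually run.

```python
from enum import IntEnum


class Step(IntEnum):
    """The six procedures of the troubleshooting theory, in order."""
    IDENTIFY_PROBLEM = 1
    ESTABLISH_THEORY = 2
    TEST_THEORY = 3
    PLAN_AND_IMPLEMENT = 4
    VERIFY_AND_PREVENT = 5
    DOCUMENT = 6


def next_step(current: Step, succeeded: bool = True):
    """Return the step that follows `current`, or None when finished.

    Assumption: a failed theory test or a failed verification sends
    the technician back to establishing a new theory.
    """
    if not succeeded and current in (Step.TEST_THEORY, Step.VERIFY_AND_PREVENT):
        return Step.ESTABLISH_THEORY
    if current is Step.DOCUMENT:
        return None
    return Step(current + 1)
```

For example, `next_step(Step.TEST_THEORY, succeeded=False)` returns `Step.ESTABLISH_THEORY`, while `next_step(Step.DOCUMENT)` returns `None`.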

Identify the Problem

  • even when problem seems obvious, gather as much info as possible about
    • computer and peripherals
    • applications
    • OS
    • history
  • this info helps clearly identify problem

Examine the Environment

  • ideally, one will be able to go onsite and see the computer patient in its working environment so info can be gathered by observation
  • once onsite, one may notice a situation that contributed to problem or could cause other problems
  • if one cannot go onsite, one may be able to diagnose and correct software problems remotely, using Remote Assistance or Remote Desktop
    • otherwise dependent on user observation
  • whether onsite or remote, you are looking for cause of problem, which is often a result of some change, either in environment or computer directly
  • when troubleshooting a system that is not functioning properly, make it a practice to always perform a visual inspection of all cables and connectors, making sure all connections are proper before investing any time in troubleshooting

Question the User: What Has Happened?

  • best source for learning what happened leading up to a problem is the user who was using the computer when problem occurred
  • first question to user should be, "What happened?"
    • prompts user to discuss problem
  • ask for particular details of events leading up to failure, and symptoms user experienced
  • Do other devices work?
    • helps isolate problem
      • if one or more other devices don't work, one is dealing with a more serious, device-independent problem
  • ask about device's history
    • did device ever work?
      • if user says it is a newly installed device, one has a very different task ahead than if user says it worked fine until now
        • former indicates a flawed installation
        • latter points to possible device failure
  • if user mentions an error message, ask for as much detail about the error as possible
    • if user cannot remember, try to recreate the problem
    • ask if error message is new or old, and if computer's behavior changed after the error
      • computer might issue warning that simply informs user of some condition
    • if error code points to device, such as optical drive, ask device-related questions
  • the Event logs in Windows save many error messages, so if the user cannot remember error messages, check the Event logs
  • sometimes customers are reluctant to give all details of problem because of fear of embarrassment or being held responsible; treat customer in a respectful manner that encourages trust and openness about what may have occurred
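
For the Event log check mentioned above, a minimal Python sketch follows. It assumes Windows Vista or later, where the built-in `wevtutil` command-line tool is available. The function names `build_event_query` and `recent_events` are illustrative only.

```python
import subprocess
import sys


def build_event_query(log_name: str = "System", count: int = 10) -> list:
    """Build a wevtutil command that reads the newest `count` events."""
    return [
        "wevtutil", "qe", log_name,
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # read direction: newest events first
        "/f:text",       # plain-text output instead of XML
    ]


def recent_events(log_name: str = "System", count: int = 10) -> str:
    """Run the query and return its text output (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_event_query(log_name, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `recent_events("Application", 5)` on the problem system would return the five newest Application log entries as text, which can then be searched for the error the user could not remember.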

Question the User: What Has Changed?

  • find out about any recent changes to computer or surroundings
    • ask if new application or component has recently been installed
      • if so, has computer worked at all since new installation?
        • answer could lead to important info about application or device conflicts
          • if user says audio has not worked since a particular game was loaded, the two events are most likely related
            • when troubleshooting, remove the software or other upgrades installed shortly before problem occurred
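
The idea of relating a problem to recent changes can be sketched in Python as a simple filter over a list of changes. Everything here is hypothetical: the function name `changes_before_problem`, the seven-day default window, and the example data are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def changes_before_problem(changes, problem_time, window=timedelta(days=7)):
    """Return (name, installed_at) pairs installed within `window`
    before `problem_time`, most recent first."""
    suspects = [
        (name, installed_at)
        for name, installed_at in changes
        if problem_time - window <= installed_at <= problem_time
    ]
    return sorted(suspects, key=lambda item: item[1], reverse=True)


# Hypothetical example data: what was installed and when.
changes = [
    ("printer driver update", datetime(2024, 3, 1, 9, 0)),
    ("new game", datetime(2024, 3, 9, 20, 30)),
    ("antivirus definitions", datetime(2024, 3, 12, 8, 0)),
]
problem_time = datetime(2024, 3, 10, 8, 0)  # audio first reported broken

suspects = changes_before_problem(changes, problem_time)
print(suspects[0][0])  # prints "new game"
```

The most recent change before the problem appears first, which matches the advice to start by removing whatever was installed shortly before the problem occurred.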

Establish a Theory of Probable Cause

  • when symptoms of problem have been determined, try to replicate problem and begin an analysis, from which one will develop theory of probable cause
    • if user says printer will not work, have them send another print job to printer
      • watch closely as user performs task
      • take note of any error messages or unusual activity that they may not have noticed
      • observation will show process from beginning to end
        • looking over user's shoulder gives one a different perspective and may reveal mistakes, such as an incorrect printer selection or the absence of an entry in the Number of Pages field, that caused the problem

Vendor Documentation

  • while working to pinpoint source of problem, check out any vendor documentation for software and hardware associated with problem
    • hard copy or info on vendor website

Hardware or Software

  • from observations and info gathered from user, try to pinpoint the cause of problem
    • it may be obvious that a device itself will not power up
  • if source of problem has yet to be found, search needs to be narrowed down further by determining whether problem is hardware or software related
  • hardware problem is considered to include the device, as well as device drivers and configuration
  • software problem considered to include applications, OSes, and utilities
  • one of the quickest ways to determine if hardware or software is at fault is to use Windows Device Manager
    • indicates any conflicting or unknown devices
    • just because Device Manager offers no info about problem does not mean it is not hardware related
      • it only means Windows has not recognized it
    • to open it, right-click Computer icon and select Properties
    • in the System dialog box, select Device Manager task
    • in Device Manager's window, devices on computer will be displayed and organized under hardware type, such as:
      • Computer
      • Disk Drives
      • Display Adapters
      • DVD/CD-ROM Drives
      • Human Interface Devices
    • if Windows detects a problem with a device, it expands the device type to show the devices
      • in case of a device with a configuration problem, exclamation mark will be present on both type icon and device icon
      • When Windows recognizes a device but does not understand its type, device will be listed as Other Devices with a question mark
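
The Device Manager indicators described above can be summarized as a small decision function. This is only a sketch of the notes: the function name `interpret_device_entry` and the marker labels `"exclamation"` and `"question"` are hypothetical, and nothing here queries Windows itself.

```python
def interpret_device_entry(listed, marker=None, under_other_devices=False):
    """Translate what Device Manager shows for a device into a reading.

    listed: True if the device appears in Device Manager at all.
    marker: "exclamation", "question", or None (hypothetical labels).
    under_other_devices: True if shown under the Other Devices type.
    """
    if not listed:
        # Windows has not recognized the device; hardware is not ruled out.
        return "not recognized by Windows"
    if under_other_devices or marker == "question":
        return "recognized, but type unknown"
    if marker == "exclamation":
        return "configuration problem"
    # No marker only means Windows reports nothing wrong.
    return "no problem reported"
```

Note that the last case does not clear the hardware: as stated above, a lack of info in Device Manager only means Windows has not flagged anything.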

Probable Causes

  • from observation and research, compile list of probable causes; if any of them have simple solutions, apply those first
    • if problem is not resolved, investigate other items on list
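
A minimal Python sketch of working through a list of probable causes, simplest first, follows. The `Cause` class, its `fix_minutes` effort estimate, and the `try_fix` callback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    description: str
    fix_minutes: int  # hypothetical rough estimate of effort to try the fix


def order_by_simplicity(causes):
    """Return the causes sorted so the simplest fix comes first."""
    return sorted(causes, key=lambda cause: cause.fix_minutes)


def first_fix_that_works(causes, try_fix):
    """Try fixes from simplest to hardest; return the cause whose fix
    resolved the problem, or None if the list is exhausted."""
    for cause in order_by_simplicity(causes):
        if try_fix(cause):
            return cause
    return None
```

Here `try_fix` stands in for whatever the tech actually does to apply and check a fix. A return value of `None` means every item on the list was tried without success, so a new list of probable causes is needed.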

Test the Theory to Determine Actual Cause

  • after establishing theory of probable cause, test theory
    • may need test system, isolated from rest of network, or, if that is not an option, test theory on problem system
      • make sure method of test does not endanger user's data or productivity
  • Does solution extend beyond a single user's desktop and beyond scope of responsibility?
    • if so, problem must be escalated to another department, such as network administration
  • Once theory is tested and found successful, one can proceed to next step

Establish an Action Plan to Resolve the Problem and Implement the Solution

  • after successful testing of theory of a probable cause, one can move on to the planning stage
  • think through both actions necessary and possible consequences of those actions
  • involve people from all areas affected by the problem and by the effects of the solution
    • Business areas can include:
      • accounting
      • billing
      • manufacturing
      • sales
      • customer service
  • also check with all IT support areas that must take part in solution
  • plan should include:
    • steps to take
    • order in which steps should be done
    • all testing and follow-up needed
      • includes steps required to minimize any possible bad effects
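
The contents of an action plan listed above can be captured in a small data structure. The sketch below is one possible shape, not a prescribed format: the `ActionPlan` class and its field and method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ActionPlan:
    steps: list           # steps to take, in the order they should be done
    follow_up: list       # testing and follow-up, including damage control
    affected_areas: list  # business and IT areas touched by the solution
    consulted: set = field(default_factory=set)  # areas already involved

    def unconsulted_areas(self):
        """Return affected areas that have not yet been involved."""
        return [area for area in self.affected_areas if area not in self.consulted]

    def ready(self):
        """A plan is ready when it has steps and every area is involved."""
        return bool(self.steps) and not self.unconsulted_areas()
```

A plan with `affected_areas=["billing", "network administration"]` and an empty `consulted` set is not ready. Adding both areas to `consulted` makes `ready()` return `True`.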

Verify Full System Functionality and Implement Preventive Measures

  • whether problem and solution involve a single computer or an entire enterprise, one must always verify full system functionality
    • if dealing with a single system, once solution has been applied, restart system and device
  • if solution has negative effect on anything, take additional steps to correct problem
    • may be put back in troubleshooting loop
  • Once solution is tested successfully, have user verify and confirm that problem is solved
    • verification should begin exactly as the user's work day begins:
      • restarting computer and/or logging on
      • opening each application required in a typical day
      • using all peripherals, such as printers
      • have user confirm everything works
  • once one touches a user's computer, even though problem is solved, user will associate one with the next thing that goes wrong
    • may result in a call saying that "such and such" has not worked since one was there, even though "such and such" does not relate to any changes one made
  • once everything is working normally and both user and tech have tested for full system functionality, have user sign off on it to document the satisfactory results. If sign-off is not an accepted procedure in organization, suggest it because it adds commitment to both sides of transaction
    • tech is committed to test and confirm a successful solution and user is committed to acknowledge that solution worked
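
The user verification walk-through described above can be treated as a checklist that is run in order. A minimal Python sketch follows. The function name `run_verification` and the example checks are hypothetical; in practice each check would be a real test or the user's own yes/no answer.

```python
def run_verification(checks):
    """Run each (description, check) pair in order and return the
    descriptions of the checks that failed."""
    return [description for description, check in checks if not check()]


# Hypothetical checks modeled on the start of a user's work day.
workday_checks = [
    ("restart computer and log on", lambda: True),
    ("open each application used in a typical day", lambda: True),
    ("print a test page", lambda: False),
]

failed = run_verification(workday_checks)
print(failed)  # prints ['print a test page']
```

An empty result is the point at which the user can be asked to sign off. Any entry in the list sends the tech back into the troubleshooting loop.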

Document Findings, Actions, and Outcomes

  • document all findings, actions, and outcomes
  • take notes as one works
  • once problem is resolved, review notes and add any omissions
  • sit down with user and review what one did
    • statement to user that certain changes were made
      • be clear that no other changes to system were made
  • afterwards, these notes, formal or informal, such as comments entered into a help desk database, will be useful when encountering the same or similar problems
    • good idea to incorporate some of the lessons learned during troubleshooting into training for both end users and support personnel
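
A minimal Python sketch of a record that holds findings, actions, and outcomes, and of looking up earlier records for similar problems, follows. The `TroubleshootingRecord` class and the `find_similar` helper are hypothetical; a real help desk database would have its own schema and search.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TroubleshootingRecord:
    problem: str
    findings: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    outcomes: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record, e.g. for pasting into a ticket."""
        return json.dumps(asdict(self), indent=2)


def find_similar(records, keyword):
    """Return records whose problem text mentions `keyword`
    (case-insensitive substring match)."""
    keyword = keyword.lower()
    return [record for record in records if keyword in record.problem.lower()]
```

Appending to `findings`, `actions`, and `outcomes` while working keeps the notes in the same three groups that the final documentation needs.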

Training

  • well-trained personnel are best defense against problems
  • important troubleshooting technique is ongoing training for both end users and support personnel
    • delivery methods and training materials should suit environment; many options are available, including help programs in most OSes and applications, user manuals, installation manuals, and Internet or intranet resources
  • all personnel involved should know how to access any training resources available
  • end users can often solve their own problems by checking out help program or accessing an online training module, cutting down on number of service calls and associated loss of productivity