Bash Scripting Challenge - Log Analyzer and Report Generator

Scenario

You are a system administrator responsible for managing a network of servers. Every day, a log file is generated on each server containing important system events and error messages. As part of your daily tasks, you need to analyze these log files, identify specific events, and generate a summary report.

Log Analyzer:

A log analyzer written in Bash can parse log files, extract relevant information, and generate reports or trigger actions based on the log data. Here's a basic example of a simple log analyzer:

#!/bin/bash

logfile="access.log"  # Path to the log file you want to analyze

# Function to analyze the log file
analyze_log() {
    echo "Analyzing log file: $logfile"

    # Count the number of occurrences of each IP address in the log file
    awk '{print $1}' "$logfile" | sort | uniq -c | sort -rn

    # Add additional analysis or processing as per your requirements
}

# Check if the log file exists
if [ -f "$logfile" ]; then
    analyze_log
else
    echo "Log file not found: $logfile"
fi

In this example, the script assumes the log file is named access.log. The analyze_log function uses awk to extract the IP address from each line of the log file, sorts the addresses, counts the occurrences with uniq -c, and lists them in descending order of frequency.
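To try it, save the script (the file name log_analyzer.sh below is arbitrary), make it executable, and run it in a directory containing access.log. The IP addresses and counts shown here are purely illustrative:

    $ chmod +x log_analyzer.sh
    $ ./log_analyzer.sh
    Analyzing log file: access.log
         42 192.0.2.10
         17 198.51.100.7
          3 203.0.113.25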

Here are some additional features and enhancements to consider when developing a log analyzer in Bash (a short sketch illustrating the first three follows the list):

  1. Filtering and Searching: Add the ability to filter and search for specific log entries based on different criteria such as timestamps, IP addresses, error codes, or keywords. This can help pinpoint specific events or patterns in the log file.

  2. Statistical Analysis: Calculate and display statistical information about the log data, such as the most frequent IP addresses, top URLs, average response times, or error rate. You can use commands like awk, grep, or sed to extract relevant data and perform calculations.

  3. Error and Alert Notifications: Implement a mechanism to detect specific error conditions or patterns in the log file and send notifications or alerts. For example, you can check for excessive failed login attempts, server errors, or security-related events and send notifications via email or other messaging services.

  4. Data Visualization: Generate visual representations of log data using tools like gnuplot or matplotlib. You can create charts, graphs, or histograms to better understand and present patterns or trends in the log file.

  5. Log Rotation and Archiving: Handle log rotation and archiving to ensure the analyzer can process both current and archived log files. You can automate the process of detecting and handling rotated log files by checking for new files or monitoring log file changes.
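
To make points 1-3 concrete, here is a minimal sketch. It assumes an Apache/Nginx-style common log format where field 1 is the client IP, field 7 the requested URL, and field 9 the HTTP status code; adjust the field numbers and date format for your own logs:

#!/bin/bash
# Sketch: filtering, simple statistics, and a basic alert threshold.
# Assumes a common-log-format file; field numbers are format-specific.

logfile="${1:-access.log}"

# Filtering: show only today's entries (CLF timestamps look like 10/Oct/2000)
grep "$(date +%d/%b/%Y)" "$logfile"

# Statistics: top 10 requested URLs (field 7 in the common log format)
awk '{print $7}' "$logfile" | sort | uniq -c | sort -rn | head -n 10

# Alerting: warn when the number of 5xx responses exceeds a threshold
threshold=100
server_errors=$(awk '$9 ~ /^5/ {count++} END {print count+0}' "$logfile")
if [ "$server_errors" -gt "$threshold" ]; then
    echo "ALERT: $server_errors server errors found in $logfile" >&2
    # Optionally send mail if a mailer is configured, e.g.:
    # echo "High error rate in $logfile" | mail -s "Log alert" admin@example.com
fi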

Task 1:

Write a Bash script that automates the process of analyzing log files and generating a daily summary report. The script should perform the following steps:

  1. Input: The script should take the path to the log file as a command-line argument.

      #!/bin/bash
    
      echo "Log Analyzer & Report Generator"
    
      # Check if a log file path is provided as a command-line argument
      if [ $# -ne 1 ]; then
          echo "Usage: $0 <path_to_log_file>"
          exit 1
      fi
    
      log_file="$1"
    
      # Check if the log file exists
      if [ ! -f "$log_file" ]; then
          echo "Error: Log file '$log_file' not found."
          exit 1
      fi
    
  2. Error Count: Analyze the log file and count the number of error messages. An error message can be identified by a specific keyword (e.g., "ERROR" or "Failed"). Print the total error count.

      # Count and print the total error count based on keywords using grep
      count_errors=$(grep -ciE "error|failed" "$log_file")
      echo -e "\nNumber of errors found in log file: $count_errors"

  3. Critical Events: Search for lines containing the keyword "CRITICAL" and print those lines along with the line number.

      # Print lines containing the keyword CRITICAL, with line numbers
      critical_error=$(grep -n "CRITICAL" "$log_file")
      echo -e "\nThese are the lines containing the keyword 'CRITICAL':"
      echo -e "\n$critical_error"

  4. Error Messages: Identify the top 5 most common error messages and display them along with their occurrence count.

      # Identify and display the top 5 most common error messages with occurrence count
      top_errors=$(grep -iE "error|failed" "$log_file" | sort | uniq -c | sort -nr | head -n 5 | awk '{$1=$1; for(i=1;i<=NF;i++) { printf "%-2s ", $i } printf "\n"}')
      # Display the results with left-aligned columns
      echo -e "\nTop 5 most common error messages:"
      echo "___________________________________"
      echo "$top_errors"

  5. Summary Report: Generate a summary report in a separate text file. The report should include:

      • Date of analysis
      • Log file name
      • Total lines processed
      • Total error count
      • Top 5 error messages with their occurrence count
      • List of critical events with line numbers

      # Generate Summary Report
      summary_file="summary_report.txt"

      {
          echo "Date of analysis: $(date)"
          echo "Log file name: $log_file"
          echo "Total lines processed: $(wc -l < "$log_file")"
          echo "Total error count: $count_errors"
          echo -e "\nTop 5 error messages with occurrence count:"
          echo "$top_errors"
          echo -e "\nList of critical events (with line numbers):"
          echo "$critical_error"
      } > "$summary_file"

      echo -e "\nSummary report generated: $summary_file"

  6. Optional Enhancement: Add a feature to automatically archive or move processed log files to a designated directory after analysis (a move-based alternative is sketched after the code below).

      # Ask the user if they want to archive the log file
      read -p "Do you want to archive the log file? (y/n): " choice
      if [ "$choice" == "y" ] || [ "$choice" == "Y" ]; then
          read -p "Enter the destination directory for archiving: " destination_dir
          if [ ! -d "$destination_dir" ]; then
              echo "Creating destination directory: $destination_dir"
              mkdir -p "$destination_dir"
          fi

          archive_name=$(basename "$log_file").$(date +"%Y%m%d%H%M%S").tar.gz
          tar -czvf "$destination_dir/$archive_name" "$log_file"
          echo -e "\nLog file archived to: $destination_dir/$archive_name"
      else
          echo "You chose not to archive the log file."
          echo "Thank you for using our script."
      fi
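
      The task says "archive or move"; if you prefer to simply move the processed file instead of creating a tarball, a minimal alternative (reusing the same variables as the snippet above) is:

      # Alternative: move the processed log instead of creating a tarball
      # (uses the same $log_file and $destination_dir as the snippet above)
      mv "$log_file" "$destination_dir/$(basename "$log_file").$(date +"%Y%m%d%H%M%S")"
      echo "Log file moved to: $destination_dir/"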
          

Final Solution:

#!/bin/bash

echo "Log Analyzer & Report Generator"

# Check if a log file path is provided as a command-line argument
if [ $# -ne 1 ]; then
    echo "Usage: $0 <path_to_log_file>"
    exit 1
fi

log_file="$1"

# Check if the log file exists
if [ ! -f "$log_file" ]; then
    echo "Error: Log file '$log_file' not found."
    exit 1
fi

# Count and print the total error count based on keywords using grep
count_errors=$(grep -ciE "error|failed" "$log_file")
echo -e "\nNumber of errors found in log file: $count_errors"

# Print lines containing the keyword CRITICAL, with line numbers
critical_error=$(grep -n "CRITICAL" "$log_file")
echo -e "\nThese are the lines containing the keyword 'CRITICAL':"
echo -e "\n$critical_error"

# Identify and display the top 5 most common error messages with occurrence count
top_errors=$(grep -iE "error|failed" "$log_file" | sort | uniq -c | sort -nr | head -n 5 | awk '{$1=$1; for(i=1;i<=NF;i++) { printf "%-2s ", $i } printf "\n"}')
# Display the results with left-aligned columns
echo -e "\nTop 5 most common error messages:"
echo "___________________________________"
echo "$top_errors"

# Generate Summary Report
summary_file="summary_report.txt"

{
    echo "Date of analysis: $(date)"
    echo "Log file name: $log_file"
    echo "Total lines processed: $(wc -l < "$log_file")"
    echo "Total error count: $count_errors"
    echo -e "\nTop 5 error messages with occurrence count:"
    echo "$top_errors"
    echo -e "\nList of critical events (with line numbers):"
    echo "$critical_error"
} > "$summary_file"

echo -e "\nSummary report generated: $summary_file"

# Ask the user if they want to archive the log file
read -p "Do you want to archive the log file? (y/n): " choice
if [ "$choice" == "y" ] || [ "$choice" == "Y" ]; then
    read -p "Enter the destination directory for archiving: " destination_dir
    if [ ! -d "$destination_dir" ]; then
        echo "Creating destination directory: $destination_dir"
        mkdir -p "$destination_dir"
    fi

    archive_name=$(basename "$log_file").$(date +"%Y%m%d%H%M%S").tar.gz
    tar -czvf "$destination_dir/$archive_name" "$log_file"
    echo -e "\nLog file archived to: $destination_dir/$archive_name"
else
    echo "You chose not to archive the log file."
    echo "Thank you for using our script."
fi
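
An illustrative run (the script name, path, and counts are hypothetical, and the critical-events and top-5 sections are elided with "..."):

    $ ./log_analyzer.sh /var/log/app/events.log
    Log Analyzer & Report Generator

    Number of errors found in log file: 27
    ...
    Summary report generated: summary_report.txt
    Do you want to archive the log file? (y/n): n
    You chose not to archive the log file.
    Thank you for using our script.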
          

You can enhance the script further by adding options to customize the report output format, analyze different data or log files, or pass additional parameters to the report generation process.

Command-line arguments or a configuration file are good ways to accomplish this, as the sketch below illustrates.
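
As a minimal sketch of that idea, the option letters below (-f for the log file, -o for the report path, -k for the error keyword pattern) are arbitrary choices, parsed with Bash's built-in getopts:

#!/bin/bash
# Sketch: configurable options via getopts (option letters are illustrative)

log_file="access.log"               # default log file
summary_file="summary_report.txt"   # default report path
pattern="error|failed"              # default error keyword pattern

while getopts "f:o:k:" opt; do
    case "$opt" in
        f) log_file="$OPTARG" ;;
        o) summary_file="$OPTARG" ;;
        k) pattern="$OPTARG" ;;
        *) echo "Usage: $0 [-f log_file] [-o report_file] [-k pattern]"; exit 1 ;;
    esac
done

echo "Analyzing '$log_file' for pattern '$pattern'; report -> '$summary_file'"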

Testing the script with various data scenarios and handling error conditions will help ensure its reliability and accuracy as a report generator.