
\documentclass{article}

\usepackage{xcolor}
\usepackage{minted}
\usepackage{hyperref}

\title{CS3502 Project 2: CPU Scheduling Simulator}
\author{Kiana Sheibani}

\begin{document}
\maketitle
\newpage

\section{Introduction}

CPU scheduling, the allotment of processor execution time to running processes, is an integral part of any modern operating system. Many scheduling strategies exist, each designed to optimize different metrics:

\begin{enumerate}
  \item \emph{Waiting time}, the amount of time between a process becoming ready and the CPU beginning to execute it;
  \item \emph{Turnaround time}, the amount of time between a process becoming ready and finishing execution;
  \item \emph{CPU utilization}, the percentage of CPU cycles spent executing a process;
  \item \emph{Throughput}, the rate at which processes finish executing.
\end{enumerate}
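
As a rough illustration of how these metrics can be computed from a finished schedule, the following C sketch averages them over an array of per-process records. The \texttt{proc\_times} and \texttt{metrics} structs and the \texttt{compute\_metrics} function are hypothetical names chosen for this example and are not taken from the simulator. The sketch also assumes that elapsed time is measured from the first arrival to the last completion.

\begin{minted}[fontsize=\footnotesize, frame=lines, framesep=2mm, baselinestretch=1.2]{c}
#include <stddef.h>

/* Hypothetical record of one finished process; the field names are
   illustrative and are not taken from the simulator's source. */
struct proc_times {
    long arrival;  /* time the process became ready */
    long start;    /* time the CPU first began executing it */
    long finish;   /* time the process completed */
    long burst;    /* total CPU time the process consumed */
};

/* Aggregate metrics over a whole run. */
struct metrics {
    double avg_waiting;     /* mean of (start - arrival) */
    double avg_turnaround;  /* mean of (finish - arrival) */
    double utilization;     /* busy time / elapsed time, in [0, 1] */
    double throughput;      /* processes finished per time unit */
};

/* Computes the four metrics for n >= 1 finished processes.
   Assumption: elapsed time runs from the first arrival to the
   last completion. */
struct metrics compute_metrics(const struct proc_times *p, size_t n)
{
    struct metrics m = {0.0, 0.0, 0.0, 0.0};
    long first_arrival = p[0].arrival;
    long last_finish = p[0].finish;
    long busy = 0;

    for (size_t i = 0; i < n; i++) {
        m.avg_waiting += (double)(p[i].start - p[i].arrival);
        m.avg_turnaround += (double)(p[i].finish - p[i].arrival);
        busy += p[i].burst;
        if (p[i].arrival < first_arrival) first_arrival = p[i].arrival;
        if (p[i].finish > last_finish) last_finish = p[i].finish;
    }

    double elapsed = (double)(last_finish - first_arrival);
    m.avg_waiting /= (double)n;
    m.avg_turnaround /= (double)n;
    m.utilization = elapsed > 0.0 ? (double)busy / elapsed : 0.0;
    m.throughput = elapsed > 0.0 ? (double)n / elapsed : 0.0;
    return m;
}
\end{minted}

For two processes with (arrival, start, finish, burst) values of (0, 0, 4, 4) and (2, 4, 10, 6), this gives an average waiting time of 1, an average turnaround time of 6, a utilization of 1, and a throughput of 0.2 processes per time unit.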

CPU scheduling strategies include FCFS (First Come First Serve), Round Robin, priority scheduling, and more. To determine which of these strategies is most effective, it is useful to simulate them on test workloads and compare the results.

\section{Methodology}

To run these tests, I wrote a program in C to simulate different scheduling strategies based on process statistics input by the user. The code for this program is stored in my repository for this course under the \texttt{project2} subdirectory: \url{https://git.tokinanpa.dev/toki/CS3502}.

The simulator supports seven different strategies:

\begin{itemize}
  \item \texttt{sjf} (Shortest Job First) --- Processes are run in ascending order of time to execute (\emph{burst time}). This is theoretically ideal but unattainable in practice, since a program's execution time cannot be known in advance.
  \item \texttt{fcfs} (First Come First Serve) --- Processes are run in the order that they arrived in the ready queue. This is the simplest strategy.
  \item \texttt{roundrobin} (Round Robin) --- The scheduler cycles through the ready processes, running each for a fixed time interval before moving on to the next.
  \item \texttt{priority} (Priority Queue) --- Each process is assigned a priority, and higher-priority processes are run before lower-priority ones.
  \item \texttt{priorityaging} (Priority Queue with Aging) --- Similar to \texttt{priority}, but each waiting process has its priority raised over time to prevent \emph{starvation}, in which a low-priority process never runs because higher-priority processes keep being scheduled ahead of it.
  \item \texttt{lottery} (Lottery) --- The next process to run is chosen at random from the ready queue.
  \item \texttt{srtf} (Shortest Remaining Time First) --- The most complex algorithm supported. This strategy is similar to \texttt{sjf}, but the burst time of each process is estimated from the burst times of previous processes, and the scheduler always runs the process with the least remaining time, preempting the current process if necessary.
\end{itemize}
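
To make one of these strategies concrete, the following is a minimal sketch of Round Robin in C. It is not the simulator's implementation: the \texttt{round\_robin} function, the \texttt{MAX\_PROCS} limit, and the parallel-array interface are assumptions made for this example. It also assumes that processes arriving during a time slice are queued ahead of the process that was just preempted, which is one common convention among several.

\begin{minted}[fontsize=\footnotesize, frame=lines, framesep=2mm, baselinestretch=1.2]{c}
#include <stddef.h>

#define MAX_PROCS 64   /* hypothetical capacity for this sketch */

/* Round Robin sketch: fills finish[i] with the completion time of
   process i. Assumes n <= MAX_PROCS, quantum > 0, every burst > 0,
   and arrival[] sorted in ascending order. */
void round_robin(const long *arrival, const long *burst, size_t n,
                 long quantum, long *finish)
{
    long remaining[MAX_PROCS];
    size_t queue[MAX_PROCS];     /* circular ready queue of indices */
    size_t head = 0, count = 0;  /* queue front and queue length */
    size_t next = 0;             /* next process yet to arrive */
    size_t done = 0;
    long now = 0;

    for (size_t i = 0; i < n; i++)
        remaining[i] = burst[i];

    while (done < n) {
        /* Admit every process that has arrived by now. */
        while (next < n && arrival[next] <= now) {
            queue[(head + count) % MAX_PROCS] = next++;
            count++;
        }
        if (count == 0) {        /* CPU idle: jump to the next arrival */
            now = arrival[next];
            continue;
        }

        /* Take the process at the front of the queue. */
        size_t cur = queue[head];
        head = (head + 1) % MAX_PROCS;
        count--;

        /* Run it for one quantum, or less if it finishes sooner. */
        long slice = remaining[cur] < quantum ? remaining[cur] : quantum;
        now += slice;
        remaining[cur] -= slice;

        /* Processes that arrived during the slice queue up ahead of
           the preempted one (an assumed tie-breaking convention). */
        while (next < n && arrival[next] <= now) {
            queue[(head + count) % MAX_PROCS] = next++;
            count++;
        }

        if (remaining[cur] > 0) {
            queue[(head + count) % MAX_PROCS] = cur;
            count++;
        } else {
            finish[cur] = now;
            done++;
        }
    }
}
\end{minted}

With arrival times 0, 1, 2, burst times 5, 3, 1, and a quantum of 2, the three processes finish at times 9, 8, and 5 respectively.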

The simulator also supports a mode to compare the performance of each strategy on the same dataset.
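
One way such a comparison mode could be organized is as a table of named strategies, each run on its own copy of the input so that one strategy's output cannot leak into the next. The sketch below is a hypothetical arrangement, not the simulator's actual code: the \texttt{sim\_proc} struct, the \texttt{strategy} table, and the \texttt{compare\_all} function are invented for illustration, and only two simple non-preemptive strategies are included to keep it short.

\begin{minted}[fontsize=\footnotesize, frame=lines, framesep=2mm, baselinestretch=1.2]{c}
#include <stddef.h>
#include <string.h>

#define MAX_PROCS 64   /* hypothetical capacity for this sketch */

/* Hypothetical process record: two inputs plus one output field. */
struct sim_proc {
    long arrival;
    long burst;
    long start;    /* filled in by a strategy */
};

typedef void (*strategy_fn)(struct sim_proc *procs, size_t n);

/* FCFS: run in array order (assumed sorted by arrival time). */
static void run_fcfs(struct sim_proc *p, size_t n)
{
    long now = 0;
    for (size_t i = 0; i < n; i++) {
        if (now < p[i].arrival) now = p[i].arrival;
        p[i].start = now;
        now += p[i].burst;
    }
}

/* Non-preemptive SJF: among the processes that have arrived and not
   yet started, run the one with the shortest burst. */
static void run_sjf(struct sim_proc *p, size_t n)
{
    int started[MAX_PROCS] = {0};
    size_t done = 0;
    long now = 0;

    while (done < n) {
        size_t best = n;      /* n means "nothing ready" */
        size_t earliest = n;  /* unstarted process that arrives first */
        for (size_t i = 0; i < n; i++) {
            if (started[i]) continue;
            if (earliest == n || p[i].arrival < p[earliest].arrival)
                earliest = i;
            if (p[i].arrival <= now &&
                (best == n || p[i].burst < p[best].burst))
                best = i;
        }
        if (best == n) {      /* CPU idle: jump to the next arrival */
            now = p[earliest].arrival;
            continue;
        }
        started[best] = 1;
        p[best].start = now;
        now += p[best].burst;
        done++;
    }
}

struct strategy { const char *name; strategy_fn run; };

static const struct strategy strategies[] = {
    { "fcfs", run_fcfs },
    { "sjf",  run_sjf  },
};

#define NUM_STRATEGIES (sizeof strategies / sizeof strategies[0])

/* Runs every strategy on its own copy of the input (n <= MAX_PROCS)
   and stores each strategy's average waiting time in avg_wait[]. */
void compare_all(const struct sim_proc *input, size_t n, double *avg_wait)
{
    struct sim_proc scratch[MAX_PROCS];
    for (size_t s = 0; s < NUM_STRATEGIES; s++) {
        memcpy(scratch, input, n * sizeof scratch[0]);
        strategies[s].run(scratch, n);
        double total = 0.0;
        for (size_t i = 0; i < n; i++)
            total += (double)(scratch[i].start - scratch[i].arrival);
        avg_wait[s] = total / (double)n;
    }
}
\end{minted}

Copying the input before each run keeps the comparison fair, since every strategy sees exactly the same arrival and burst times. Adding another strategy to the comparison then only requires one more entry in the table.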

\subsection{Technical Details}

The processes to be scheduled are passed to the scheduler as an array of structs, where each struct contains the process's properties: a process ID, its arrival time, its burst length, and a priority value if the scheduler requires one. The struct also contains fields for the scheduler's output, specifically the intervals of time in which the CPU is running the process.

\begin{listing}[!ht]
\inputminted[
firstline=11, lastline=28,
fontsize=\footnotesize,
linenos,
frame=lines,
framesep=2mm,
baselinestretch=1.2,
]{c}{../src/main.h}
\vspace*{-\baselineskip}
\caption{\texttt{main.h} --- Definition of \texttt{process}}
\end{listing}

The \texttt{cpu\_time\_t} type is an alias for the integer type \texttt{long}. It serves as the time unit for both timestamps and durations.

To simplify the scheduling algorithms, the list of processes must be sorted by arrival time. The program enforces this when the arrival times are entered.
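
One way to establish this ordering is to sort the array with the standard library's \texttt{qsort}. The sketch below assumes that the input is sorted rather than rejected when it is out of order, which may not match what the simulator does. The \texttt{proc\_entry} struct is a trimmed-down, hypothetical stand-in for the real \texttt{process} struct, and \texttt{by\_arrival} and \texttt{sort\_by\_arrival} are names invented for this example.

\begin{minted}[fontsize=\footnotesize, frame=lines, framesep=2mm, baselinestretch=1.2]{c}
#include <stdlib.h>

typedef long cpu_time_t;  /* same underlying type as described above */

/* Hypothetical, trimmed-down process record for this sketch. */
struct proc_entry {
    int pid;
    cpu_time_t arrival;
    cpu_time_t burst;
};

/* qsort comparator: ascending arrival time, ties broken by pid.
   qsort is not guaranteed to be stable, so the tie-break keeps the
   result deterministic. Comparisons are used instead of subtraction
   to avoid overflow on extreme values. */
static int by_arrival(const void *a, const void *b)
{
    const struct proc_entry *pa = a, *pb = b;
    if (pa->arrival != pb->arrival)
        return (pa->arrival > pb->arrival) - (pa->arrival < pb->arrival);
    return (pa->pid > pb->pid) - (pa->pid < pb->pid);
}

/* Sorts procs[0..n-1] in place by arrival time. */
void sort_by_arrival(struct proc_entry *procs, size_t n)
{
    qsort(procs, n, sizeof procs[0], by_arrival);
}
\end{minted}

With the array sorted, a scheduler can admit new arrivals by advancing a single index through the array instead of rescanning every process at each step.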

\section{Performance}

Unfortunately, due to bugs and other issues in the program that I could not resolve in time, I was not able to obtain as much test data on each scheduler's performance as I had hoped. In the data I did obtain, however, the \texttt{srtf} scheduler consistently achieved lower waiting and turnaround times than every other scheduling strategy (discounting the \texttt{sjf} scheduler, which is impossible in practice).

\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|}
  \hline
  Scheduler & Waiting Time & Turnaround Time & CPU \% & Throughput \\
  \hline
  \texttt{sjf} & 1112.2 & 1580.7 & 100\% & 0.002134 \\
  \hline
  \texttt{fcfs} & 2570.8 & 3039.3 & 100\% & 0.002134 \\
  \texttt{roundrobin} & 2219.6 & 3015.7 & 100\% & 0.002134 \\
  \texttt{priority} & 3097.8 & 3566.3 & 100\% & 0.002134 \\
  \texttt{priorityaging} & 3097.8 & 3566.3 & 100\% & 0.002134 \\
  \texttt{lottery} & 2590 & 3058.5 & 100\% & 0.002134 \\
  \texttt{srtf} & 1541 & 2435.7 & 100\% & 0.002134 \\
  \hline
\end{tabular}
\caption{Example output from a randomized test suite. The CPU utilization and throughput are at maximum for all schedulers because this was a high-process-density test.}
\end{table}

This suggests that SRTF is the most effective of the practical strategies tested, although more test data would be needed to confirm this. There are also other possible scheduling strategies and other considerations to account for. For example, adding a priority system on top of SRTF may make it more flexible and improve its performance across a wider range of workloads.

\end{document}