ENGG1811 Assignment 1: Fault Detection

Due date: 5pm, Friday 4 April (Week 7).

Late submissions will be penalised at a rate of 5% per day. This penalty applies to the initial mark you receive.

Submissions will generally not be accepted after 5pm, Wednesday 9 April (Week 8).


The use of generative AI is forbidden for this assignment. All work submitted must be your own. You may be asked to discuss your code with your tutor. Your final mark for this assignment will be based on how well you can explain your answer for the assignment and other related problems.

Version: v1.06 on 02 April 2025

Updates


Fault Detection

Automatic detection of faults can be found in many engineering systems. There are systems to automatically diagnose faults in engines, chemical plants, power generation plants, robotic arms and so on.

This assignment is inspired by a fault detection system in a photovoltaic (PV) plant [1]. A PV plant (see the Wikipedia page on PV power station) is a collection of solar panels which converts solar energy into electrical power. However, sometimes the plant does not work correctly, which results in, for example, less electrical power being generated than expected. If this is the case, the plant technicians should be alerted automatically so that they can fix the faults as soon as possible.

In this assignment, you will write Python programs to perform fault detection. The aim of your program is to process data sequences of solar irradiance and power to determine whether there are faults and if so, when they have occurred.

Note that we chose the word inspired earlier because we have adapted the fault detection problem in [1] as a programming assignment by simplifying and liberally changing a few aspects of the original problem. In particular, we have made changes so that, in this assignment, you will have to use the various Python constructs that you have learnt. This means a few details of this assignment may not be realistic in engineering terms, but on the whole, you will still get a taste of how programming can be used to perform automatic fault detection.

Learning Objectives

By completing this assignment, you will learn:

  1. To apply basic programming concepts of variable declaration, assignment, conditionals, functions, loops and import.
  2. To use the Python data types: list, float, int and boolean.
  3. To translate an algorithm described in a natural language to a computer language.
  4. To organize programs into modules by using functions.
  5. To use good program style including comments and documentation.
  6. To practise software development, which includes incremental development, testing and debugging.

Prohibition

You are not allowed to use numpy for this assignment. Groupwork is not allowed as this is an individual assignment.

Key Ideas Behind the Fault Detection Algorithm

The algorithm uses two sets of measurements. The first is the amount of solar irradiance which is the quantity of solar radiation falling on the solar panels. The second is the amount of electrical power generated by the solar panels; we will simply refer to that as power or power generated.

The key idea of the fault detection algorithm is to use the measured irradiance and power to determine whether a fault has occurred. For a given amount of irradiance, the algorithm uses a model (which in this case is a formula) to predict the amount of power the PV plant should generate. After that, the algorithm compares the power predicted by the formula against the measured power. If the difference between these two quantities is too big, then the algorithm will decide that a fault has occurred.

Requirements for Fault Detection   

This section describes the requirements on the fault detection algorithm that you will be programming in this assignment. You should be able to implement these requirements by using only the Python skills that you have learnt in the first four weeks of the lectures in this course.

We begin by describing the data that the algorithm will operate on, using the following Python code as an example. We will refer to this code as the sample code. Note that the data and parameter values in the sample code are for illustration only; your code should work with any allowed input data and parameter values.

# Data: irradiance and power
# Irradiance measurements in W/m^2
irradiance_time_series = [ 240.2, 220.1, 260.2, 280.7, 256.5,
320.3, 300.7, 267.1, 321.2, 234.5,
421.7, 476.2, 321.6, 329.7, 323.4,
407.9, 456.7, 489.3, 521.5, 534.6,
543.7, 567.5]

# Generated power measured in kW
power_time_series = [31.2, 27.5, 55.5, 44.2, 58.38, 53.52]

# Parameters for the fault detection algorithm
# Data sampling times in minutes
irradiance_sampling_time = 12
power_sampling_time = 60

# Parameters of the model to predict the power generated for
# a given level of irradiance
a0 = 0.086
a1 = 3.44e-5
a2 = 3e-3
model_para = [a0, a1, a2]

# Margin in power measurement to decide whether it is a fault or not
margin = 10.0 # in kW

# Call the fault detection function
import fault_detection_main as fd
fault_status_output = fd.fault_detection_main(
    irradiance_time_series, power_time_series,
    irradiance_sampling_time, power_sampling_time,
    model_para, margin)

In the sample code, there are two data series which contain, respectively, the irradiance and power measurements. Both series are Python lists whose entries are of the float type. Their variable names are irradiance_time_series and power_time_series. The irradiance is measured in Watts per square metre and power generated is measured in kilowatts.

In the sample code, the irradiance and power measurements were collected once every 12 and 60 minutes respectively. These values are stored in the variables irradiance_sampling_time and power_sampling_time.

(Remark: In [1], the irradiance measurements were taken once every 5s, which is a more realistic sampling time. We have chosen a sampling time of 12 minutes for irradiance so that the length of the list irradiance_time_series will not be exceedingly long in this example.)  

We break the algorithm down into a number of steps. The first step is to compute the average of the irradiance data. 

(Averaging the irradiance data)

Since irradiance and power were measured every 12 and 60 minutes, respectively, there are 5 irradiance samples within the duration of a power sample (60 / 12 = 5). We assume that the first power measurement power_time_series[0] corresponds to the first 5 irradiance measurements:

irradiance_time_series[0], irradiance_time_series[1], irradiance_time_series[2], irradiance_time_series[3], irradiance_time_series[4].

Similarly, the second power measurement power_time_series[1] corresponds to the next 5 irradiance measurements:

irradiance_time_series[5], irradiance_time_series[6], irradiance_time_series[7], irradiance_time_series[8], irradiance_time_series[9].

Similarly for power_time_series[2] and power_time_series[3].

Although we can match power_time_series[4] with the last two irradiance measurements, the correspondence is incomplete (5 irradiance measurements are needed) and therefore these data are not usable. Also, there are no irradiance measurements corresponding to the last power measurement power_time_series[5], which means this power measurement is not usable.

Since we can only make complete correspondences between the first 4 power measurements and the first 20 irradiance measurements, we will only use these measurements for fault detection.

We will divide the first 20 irradiance measurements into non-overlapping segments of 5 data points and compute the average of each segment. This is so that each segment of irradiance measurements corresponds to one power measurement. The table below illustrates the computation. We have added a segment number so that we can refer to them later on. Note that the segment number also corresponds to the indices in the variable power_time_series.

Segment number  Data in the segment from irradiance_time_series  Average                                      Value
0               240.2, 220.1, 260.2, 280.7, 256.5                (240.2 + 220.1 + 260.2 + 280.7 + 256.5) / 5  251.54
1               320.3, 300.7, 267.1, 321.2, 234.5                (320.3 + 300.7 + 267.1 + 321.2 + 234.5) / 5  288.76
2               421.7, 476.2, 321.6, 329.7, 323.4                (421.7 + 476.2 + 321.6 + 329.7 + 323.4) / 5  374.52
3               407.9, 456.7, 489.3, 521.5, 534.6                (407.9 + 456.7 + 489.3 + 521.5 + 534.6) / 5  482.00

For the irradiance_time_series given in the sample code, we can summarize this averaging as returning a list whose entries are [251.54, 288.76, 374.52, 482.00]. For ease of reference, we will refer to this list by using the name irradiance_time_series_average later on.

Note that we rounded the numbers in the last column to 2 decimal places for display only. You should not be rounding any of your calculations in this assignment.
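
The averaging step can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name segment_averages is hypothetical (it is not one of the template names), and the sketch assumes that any incomplete segment at the end of the list is simply ignored.

```python
def segment_averages(time_series, segment_length):
    """Average consecutive, non-overlapping segments of a list.

    Hypothetical helper for illustration only. Assumes segment_length is a
    strictly positive int; an incomplete segment at the end is ignored.
    """
    averages = []
    number_of_segments = len(time_series) // segment_length
    for segment in range(number_of_segments):
        start = segment * segment_length
        segment_data = time_series[start:start + segment_length]
        # No rounding is applied to the result.
        averages.append(sum(segment_data) / segment_length)
    return averages
```

With the 22 irradiance values from the sample code and a segment length of 5, this sketch produces four averages, which agree with the table above once rounded to 2 decimal places for display.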

(Use the average irradiance and model to predict the expected power generated)

The next step is to use the average irradiance in each segment to predict the expected amount of power generated. To do that, we use a model (which in this case is a formula) to calculate the expected power from irradiance. We first define some notation: G denotes the average irradiance in a segment (in W/m^2) and P denotes the predicted power generated (in kW).

The formula is:

P = G (a0 + a1 G + a2 log(G))

where log is the natural logarithm.

By using the values of a0, a1 and a2 from the sample code, and the average irradiance calculated earlier, we can calculate the expected power generated for each time segment:

Segment number  Average irradiance  Predicted power generated (rounded to 2 decimal places for display only)
0               251.54              27.98
1               288.76              32.61
2               374.52              43.69
3               482.00              58.38
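
As an illustration of the formula, the prediction for one segment could be computed along the following lines. This is a sketch only: the name predicted_power is hypothetical, and it assumes model_para holds [a0, a1, a2] in that order, as in the sample code.

```python
import math


def predicted_power(average_irradiance, model_para):
    """Evaluate P = G (a0 + a1 G + a2 log(G)) for one average irradiance G.

    Hypothetical helper for illustration only. Assumes model_para is
    [a0, a1, a2] and that average_irradiance is strictly positive, so that
    the natural logarithm is defined.
    """
    a0, a1, a2 = model_para
    g = average_irradiance
    # math.log with one argument is the natural logarithm.
    return g * (a0 + a1 * g + a2 * math.log(g))
```

For example, with the parameter values of the sample code, predicted_power(251.54, [0.086, 3.44e-5, 3e-3]) evaluates to approximately 27.98, in agreement with segment 0 of the table.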

(Compare the predicted power generated against the measured power to determine whether there is a fault - FOR ONE POWER SAMPLE)

The next step is to compare the predicted power against the measured power. We will use the algorithmic parameter margin which is defined in the sample code.

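If the value of measured power minus predicted power is less than or equal to margin and bigger than or equal to -margin, then the decision is that there is no fault because the measured power is sufficiently close to the predicted power; otherwise, there is a fault. For example, by using the value of margin from the sample code (10.0 kW) and the data for segments 0 and 2, we have:

Segment 0: measured power minus predicted power is 31.2 - 27.98 = 3.22, which lies between -10.0 and 10.0, so there is no fault.

Segment 2: measured power minus predicted power is 55.5 - 43.69 = 11.81, which is bigger than 10.0, so there is a fault.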
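
The decision rule for one power sample can be written as a single comparison. The sketch below uses a hypothetical function name is_fault and, to keep it self-contained, assumes the predicted power has already been computed.

```python
def is_fault(measured_power, predicted_power, margin):
    """Return True if the measured power is outside the allowed band.

    Hypothetical helper for illustration only. A difference exactly equal
    to margin or -margin counts as no fault.
    """
    difference = measured_power - predicted_power
    no_fault = -margin <= difference <= margin
    return not no_fault
```

Note that the boundary values are treated as no fault, because the rule uses "less than or equal to" and "bigger than or equal to".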

(Performing fault detection for a time-series)

The above examples show how the fault detection is to be performed for two power measurements. The following table summarizes the result of fault detection for the time series.

Segment number  Average irradiance  Predicted power generated  Measured power  Measured power minus predicted power  Fault (True if it is a fault)
0               251.54              27.98                      31.2            3.22                                  False
1               288.76              32.61                      27.5            -5.11                                 False
2               374.52              43.69                      55.5            11.81                                 True
3               482.00              58.38                      44.2            -14.18                                True

We will use a list to indicate when the faults occurred. For the above example, we will represent the faults in the data series using [2,3] because the measurements power_time_series[2] and power_time_series[3] are determined to be faults.

In the case where there are no faults, we will indicate that by an empty list [ ].
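
Putting the prediction and the comparison together, the list of fault indices for a whole series could be built as in the sketch below. The name fault_indices is hypothetical, and the sketch assumes that the two input lists have the same length and that model_para is [a0, a1, a2].

```python
import math


def fault_indices(irradiance_averages, power_measurements, model_para, margin):
    """Return the indices of the power measurements judged to be faults.

    Hypothetical helper for illustration only. Assumes both lists have the
    same length and every average irradiance is strictly positive.
    """
    a0, a1, a2 = model_para
    faults = []
    for index in range(len(irradiance_averages)):
        g = irradiance_averages[index]
        predicted = g * (a0 + a1 * g + a2 * math.log(g))
        difference = power_measurements[index] - predicted
        # A fault is any difference outside the band [-margin, margin].
        if difference > margin or difference < -margin:
            faults.append(index)
    return faults
```

With the averages and the first four power measurements of the sample code, this sketch returns [2, 3]; with a margin large enough to cover every difference, it returns an empty list.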

The following figure illustrates the fault detection decision making. The solid blue dots show the predicted power generated for the average irradiance. The vertical lines are centred at the predicted power and have a height of 2*margin. The power measurements are plotted with crosses. If the cross is within the vertical line, then it is not a fault; otherwise, it is.


(Determining the false alarms)

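After a fault detection algorithm has been designed, the engineers will want to check how well the algorithm catches the faults. One way that they can do that is to monitor the PV plant manually to determine whether actual faults have occurred. There are two possible types of error: a missed detection, where a real fault is not detected by the algorithm, and a false alarm, where the algorithm reports a fault that is not a real fault.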

Let us follow on from the above example. The fault detection algorithm says the power measurements [2,3] are faults. Let us, for the sake of illustration, say that the real faults are [1,2]. In this case, the real fault at 1 is a missed detection because it is not detected by the detection algorithm. On the other hand, the fault detection algorithm claims that there is a fault at 3 but it is in fact a false alarm. Suppose we store the results from the fault detection in a list called your_fault_list and the real faults in a list called real_fault_list. For this example, your_fault_list is [2,3] and real_fault_list is [1,2].

A task for this assignment is to determine the false alarms from the given your_fault_list and real_fault_list. For this assignment, you will store the false alarms in a list. In this example, it is [3]. In the case where there are no false alarms, that should be indicated by an empty list [ ].

Note that the engineers should also be interested in missed detection, but the calculation is very similar to false alarms, so we will not ask you to do that.
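
One possible way to compute the false alarms is to keep every detected index that does not appear among the real faults. The sketch below uses a hypothetical function name false_alarms and assumes both inputs are lists of int indices.

```python
def false_alarms(detected_faults, real_faults):
    """Return the detected faults that are not real faults.

    Hypothetical helper for illustration only. The order of the result
    follows the order of detected_faults.
    """
    alarms = []
    for index in detected_faults:
        if index not in real_faults:
            alarms.append(index)
    return alarms


example = false_alarms([2, 3], [1, 2])  # returns [3]
```

Swapping the two arguments would give the missed detections instead, which is why the two calculations are described as very similar.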

Validity Checks

The description above shows how the data (irradiance_time_series, power_time_series) and algorithmic parameters (irradiance_sampling_time, power_sampling_time, model_para, margin) are used to determine when the faults occur. Note that the algorithmic parameters must be valid so that the computation can be carried out. We require that your code performs a number of validity checks before determining if there are any faults. For example, we assume that the algorithmic parameter irradiance_sampling_time is required to be a strictly positive integer. The following table states the requirements for the algorithmic parameters to be valid and what assumptions you can make when testing.

For each algorithmic parameter, the entries below give the requirements for the parameter to be valid, followed by the assumptions you can make when testing or further explanation.

irradiance_sampling_time
  Requirements: Data type must be int and its value is strictly positive. Hint: The Python expression (type(x) is int) will return True if variable x is of the type int; False otherwise. Examples of invalid parameter values are -5, -5.2, 5.7.
  Assumptions: You can assume that, when we test your code, irradiance_sampling_time is always a number.

power_sampling_time
  Requirements: Data type must be int and its value is strictly positive.
  Assumptions: You can assume that, when we test your code, power_sampling_time is always a number.

irradiance_sampling_time, power_sampling_time
  Requirements: The value of power_sampling_time must be an integral multiple of the value of irradiance_sampling_time.
  Assumptions: For example, if power_sampling_time is 12 and irradiance_sampling_time is 7, then the given parameters are invalid because 12 is not an integral multiple of 7. You can also assume that power_sampling_time and irradiance_sampling_time are given in the same unit.

model_para
  Requirements: Must have exactly 3 entries in the list.
  Assumptions: You can assume that the given model_para is always a list and its entries are always numbers (int or float). For example, if the given model_para has four entries, then it is invalid.

margin
  Requirements: Must be a strictly positive number.
  Assumptions: You can assume that the given margin is always a number (int or float).
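
The validity requirements can be checked one at a time, as in the following sketch. The function name parameters_are_valid and its True/False return value are assumptions made for illustration; they are not part of the required interface.

```python
def parameters_are_valid(irradiance_sampling_time, power_sampling_time,
                         model_para, margin):
    """Return True if all algorithmic parameters meet the requirements.

    Hypothetical helper for illustration only.
    """
    # Both sampling times must be of type int and strictly positive.
    if type(irradiance_sampling_time) is not int or irradiance_sampling_time <= 0:
        return False
    if type(power_sampling_time) is not int or power_sampling_time <= 0:
        return False
    # power_sampling_time must be an integral multiple of
    # irradiance_sampling_time (checked only after both are valid ints).
    if power_sampling_time % irradiance_sampling_time != 0:
        return False
    # The model needs exactly three parameters: a0, a1 and a2.
    if len(model_para) != 3:
        return False
    # The margin must be strictly positive.
    if margin <= 0:
        return False
    return True
```

The type checks come first on purpose: the multiple check uses the % operator, which is only meaningful here once both sampling times are known to be strictly positive ints.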

Dealing with Different Amounts of Data

The above sample code shows the situation where the overall duration of power measurements (6 samples times 60 minutes = 360 minutes) is more than that of the irradiance measurements (22 samples times 12 minutes = 264 minutes). The above example shows that we should only be using the first 4 power measurements and the first 20 irradiance measurements.

Another situation is when the overall duration of power measurements is less than that of the irradiance measurements. Consider the following code:

# Irradiance measurements in W/m^2
irradiance_time_series = [ 240.2, 220.1, 260.2, 280.7,
320.3, 300.7, 267.1, 321.2,
421.7, 476.2, 321.6, 329.7,
407.9, 456.7, 489.3, 521.5,
543.7, 567.5]

# Generated power measured in kW
power_time_series = [31.2, 27.5]

# Data sampling times in minutes
irradiance_sampling_time = 15
power_sampling_time = 60

From the sampling times, we know that 1 power measurement corresponds to 4 irradiance measurements. In this case, all the power measurements and the first 8 irradiance measurements should be used to determine the faults.

When the overall duration of power measurement is equal to that of irradiance measurements, you should use all the measurements.
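
The three situations above can be handled with one calculation: count how many complete segments both series can supply. The sketch below is an illustration only; the name usable_counts is hypothetical, and it assumes the sampling times have already passed the validity checks.

```python
def usable_counts(irradiance_count, power_count,
                  irradiance_sampling_time, power_sampling_time):
    """Return (usable irradiance samples, usable power samples).

    Hypothetical helper for illustration only. Assumes valid sampling
    times, i.e. strictly positive ints with power_sampling_time an
    integral multiple of irradiance_sampling_time.
    """
    samples_per_segment = power_sampling_time // irradiance_sampling_time
    # Complete segments available on the irradiance side.
    irradiance_segments = irradiance_count // samples_per_segment
    # Each usable segment needs one power sample as well.
    segments = min(irradiance_segments, power_count)
    return segments * samples_per_segment, segments
```

For the first sample code (22 irradiance samples, 6 power samples, sampling times 12 and 60) this gives (20, 4), and for the second example (18 irradiance samples, 2 power samples, sampling times 15 and 60) it gives (8, 2).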

Checking Whether There Is Enough Data

In order for the fault detection algorithm to run, there must be enough power and irradiance measurements. The requirements are:

You can assume that, when we test your assignment, both irradiance_time_series and power_time_series are lists, and their entries are always of the float type. You can assume that the entries in irradiance_time_series are bigger than or equal to 1.

Implementation Requirements

You need to implement the following six functions. The first five functions working together will implement the fault detection algorithm. The sixth function finds the false alarms.

The requirement is that you implement each function in a separate file. This is so that we can test them independently; this point is explained in the Testing section. We have provided template files; see Getting Started.

1. def calc_average(time_series, segment_length): 
2. def power_prediction(irradiance_average_one_sample, model_para):
3. def fault_detection_one_sample(irradiance_average_one_sample, power_one_sample, model_para, margin):
4. def fault_detection_time_series(irradiance_time_series_average, power_time_series, model_para, margin):
5. def fault_detection_main(irradiance_time_series, power_time_series, irradiance_sampling_time, power_sampling_time, model_para, margin):
6. def find_false_alarms(your_fault_list, true_fault_list):

Additional requirements: In order to facilitate testing, you need to make sure that within each submitted file, you only have the code required for that function. Do not include test code in your submitted file.

Getting Started

  1. Download the ZIP file starter.zip (which contains 6 template files and 7 test files) and unzip it. This will create the directory (folder) named 'ass01'. If you are re-downloading this ZIP file, be sure to move your existing 'ass01' folder somewhere else before unzipping!
  2. First browse through all the files provided including the test files.
  3. (Incremental development) Do not try to implement too much at once, just one function at a time and test that it is working before moving on.
  4. Start implementing the first function, properly test it using the given testing file, and once you are happy, move on to the second function, and so on.
  5. Please do not use 'print' or 'input' statements. We won't be able to assess your program properly if you do. Remember, all the required values are part of the parameters, and your function needs to return the required answer. Do not 'print' your answers.

Testing

Test your functions thoroughly before submission.

You can use the provided Python programs (files like test_calc_average.py etc.) to test your functions. Please note that each file covers a limited number of test cases. We have purposely not included all the cases because we want you to think about how you should be testing your code. You are welcome to use the forum to discuss additional tests that you should use to test your code.
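
When you write additional tests, keep in mind that a floating-point result rarely matches a hand-computed value exactly, so comparing with == can fail even when the code is correct. One possible approach, sketched here with a hypothetical helper lists_are_close and an arbitrarily chosen tolerance, is to compare within a small tolerance.

```python
import math


def lists_are_close(actual, expected, tolerance=1e-9):
    """Return True if two lists of numbers agree entry by entry.

    Hypothetical test helper for illustration only. The default tolerance
    is an arbitrary choice, not a value prescribed by the assignment.
    """
    if len(actual) != len(expected):
        return False
    for index in range(len(actual)):
        if not math.isclose(actual[index], expected[index],
                            rel_tol=0.0, abs_tol=tolerance):
            return False
    return True
```

For instance, lists_are_close([0.1 + 0.2], [0.3]) is True even though 0.1 + 0.2 == 0.3 is False in Python.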

We will test each of your files independently. Let us give you an example. Let us assume we are testing three files: prog_a.py, prog_b.py and prog_c.py. These files contain one function each and they are: prog_a(), prog_b() and prog_c(). Let us say prog_b() calls prog_a(); and prog_c() calls both prog_b() and prog_a(). We will test your files as follows:

Submission

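You need to submit the following six files. Do not submit any other files. For example, you do not need to submit your modified test files.

  1. calc_average.py
  2. power_prediction.py
  3. fault_detection_one_sample.py
  4. fault_detection_time_series.py
  5. fault_detection_main.py
  6. find_false_alarms.py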

To submit this assignment, go to the Assignment 1 page and click the tab named "Make Submission".

Assessment Criteria

This assignment will be worth 27 marks, where 21 marks are for correctness and 6 marks are for style.

Correctness

Criteria                                                                               Nominal Marks
calc_average.py                                                                        3
power_prediction.py                                                                    3
fault_detection_one_sample.py                                                          3
fault_detection_time_series.py                                                         3
Case 1 for fault_detection_main.py: Expected output is the string 'Corrupted input'    2
Case 2 for fault_detection_main.py: Expected output is the string 'Not enough data'    1
Case 3 for fault_detection_main.py: Expected output is a list or an empty list         3
find_false_alarms.py                                                                   3

Style

6 marks will be awarded by your tutor for style and complexity of your solution. The style assessment includes the following, in no particular order:

Assignment Originality

You are reminded that work submitted for assessment must be your own. It's OK to discuss approaches to solutions with other students, and to get help from tutors, but you must write the Python code yourself. Sophisticated software is used to identify submissions that are unreasonably similar, and marks will be reduced or removed in such cases.

Further Information

Remarks and References

Note that some aspects of this assignment are not realistic. We mentioned the sampling time of irradiance earlier. Also, we have neglected the dependence on temperature, which is in [1].

[1] R. Platon et al., Online Fault Detection in PV Systems. IEEE Transactions on Sustainable Energy, Vol. 6, No. 4, Pages 1200-1207, October 2015.  https://ieeexplore.ieee.org/document/7098398