I am excited to be working on advancing reservoir surveillance methods in Resermine’s HawkEye platform. HawkEye combines qualitative and quantitative model workflows to help operators continuously redistribute injected water and achieve higher recovery.
In addition to reduced-physics models such as the Capacitance Resistance Model (CRM) and INSIM, we also offer statistical models (Pearson’s cross-correlation and Spearman’s rank correlation) and machine learning algorithms (random forests, neural networks) to estimate interwell connectivity in a producing conventional field. The output of the statistical and machine learning algorithms feeds into the reduced-physics models to help solve these complex nonlinear optimization problems.
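To illustrate the statistical side, here is a minimal sketch of how a correlation-based connectivity indicator could be computed for a single injector–producer pair. The rates are synthetic stand-ins for field data, and the variable names are my own, not HawkEye's:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)

# Hypothetical daily rates for one injector-producer pair (120 samples).
injection = rng.normal(500.0, 50.0, size=120)               # injector rate
production = 0.6 * injection + rng.normal(0.0, 20.0, 120)   # correlated producer response

# Each coefficient is a simple proxy for interwell connectivity strength.
r_pearson, p_pearson = pearsonr(injection, production)
rho_spearman, p_spearman = spearmanr(injection, production)

print(f"Pearson r = {r_pearson:.2f} (p = {p_pearson:.3g})")
print(f"Spearman rho = {rho_spearman:.2f} (p = {p_spearman:.3g})")
```

In practice one would compute these coefficients over every injector–producer pair, possibly with a time lag, to build the connectivity matrix that feeds the reduced-physics models.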
What I personally like is that HawkEye has been designed with the user and their decision making in mind: the results of the qualitative and quantitative models feed into each other automatically. This makes it easier for the user to obtain connectivity information, identify effective injectors, and hence build a continuous water injection strategy. The figure below compares interwell connectivity calculated using the random forest algorithm (qualitative workflow) and CRM (quantitative workflow).
Figure: Comparison of interwell connectivities of well pairs, measured as feature importance in the random forest model and gain values in the CRM model. Adapted from our upcoming publication, Yadav et al. 2020, at the SPE Latin America and Caribbean Petroleum Engineering Conference.
One of the challenges in applying any of these workflows, qualitative or quantitative, to field data is identifying outliers that may be present in a given dataset. These could be well production rates that were simply not recorded, readings from a faulty gauge, or data-entry errors.
I have been working on identifying which algorithm is most suitable for detecting the significant outliers in a dataset. In this case, significance is determined by the linkage between a production rate and the corresponding injection rate: if a change in production is driven not by an altered injection rate but by some other intervention, the data point should be classified as not significant.
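One simple way to capture this injection–production linkage is to fit a baseline relationship between the two rates and flag points whose production deviates far from what injection predicts. The sketch below uses an ordinary least-squares fit and a residual z-score threshold; it is an illustration of the idea, not the method used in HawkEye, and the function name and threshold are my own assumptions:

```python
import numpy as np

def significant_outliers(injection, production, z_thresh=3.0):
    """Flag production points whose deviation is NOT explained by injection.

    Fits a least-squares line production ~ a*injection + b, then marks
    points whose residual exceeds z_thresh standard deviations.
    """
    injection = np.asarray(injection, dtype=float)
    production = np.asarray(production, dtype=float)
    a, b = np.polyfit(injection, production, 1)
    residuals = production - (a * injection + b)
    z = (residuals - residuals.mean()) / residuals.std()
    return np.abs(z) > z_thresh

# Hypothetical example: a production drop at index 30 with no change in injection,
# i.e. an intervention-driven change the workflow should treat as not significant.
rng = np.random.default_rng(1)
inj = rng.normal(500.0, 40.0, 60)
prod = 0.5 * inj + rng.normal(0.0, 5.0, 60)
prod[30] -= 80.0
flags = significant_outliers(inj, prod)
print(np.where(flags)[0])
```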
I have spent considerable time learning about different anomaly detection methods that can be used with univariate data. I am currently comparing two Python packages, banpei and skyline, for detecting outliers in time series data. The banpei package calculates a change-point score for each production rate; this score serves as a quantitative indicator of how much the corresponding point deviates from the rest of the data. Skyline is a more complex Python package that is capable not only of anomaly detection but also of anomaly deflection: skyline can be trained to learn what is not an anomaly, enabling a more automated approach to outlier detection.
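To show what a per-point change score looks like in practice, here is a minimal stand-in built from a trailing-window z-score. This is not banpei's implementation (banpei uses singular spectrum transformation), just a self-contained sketch of the concept of scoring each point's deviance from recent history:

```python
import numpy as np

def change_score(series, window=10):
    """Score each point by its deviation from a trailing window.

    A simple stand-in for a change-point score: the absolute z-score of
    each value against the mean/std of the preceding `window` samples.
    """
    x = np.asarray(series, dtype=float)
    scores = np.zeros_like(x)
    for i in range(window, len(x)):
        past = x[i - window:i]
        std = past.std() or 1e-9   # guard against flat (zero-variance) windows
        scores[i] = abs(x[i] - past.mean()) / std
    return scores

# Hypothetical production series with a step change at index 50.
rng = np.random.default_rng(2)
rates = np.concatenate([np.full(50, 800.0), np.full(50, 600.0)])
rates += rng.normal(0.0, 10.0, 100)
scores = change_score(rates)
print(int(scores.argmax()))  # the highest score should land near the step
```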
I am currently testing the banpei package on a field dataset within the CRM workflow, and initial results look promising. It may be possible to use multiple anomaly detection packages to cross-verify outliers for a more thorough anomaly detection/deflection strategy. I am excited to be working on outlier detection, a key component of automating data preparation so users can seamlessly use our HawkEye platform.
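As a rough illustration of that cross-verification idea, a point could be treated as a verified outlier only when multiple detectors agree. The sketch below combines two elementary detectors (a global z-score test and an interquartile-range test); the detectors and their thresholds are my own stand-ins, not the packages discussed above:

```python
import numpy as np

def zscore_flags(x, thresh=3.0):
    """Flag points more than `thresh` standard deviations from the mean."""
    z = (x - x.mean()) / x.std()
    return np.abs(z) > thresh

def iqr_flags(x, k=1.5):
    """Flag points outside the Tukey fences q1 - k*IQR and q3 + k*IQR."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

# A point counts as a verified outlier only if both detectors flag it.
rng = np.random.default_rng(3)
rates = rng.normal(700.0, 15.0, 200)
rates[120] = 900.0                     # injected spike
verified = zscore_flags(rates) & iqr_flags(rates)
print(np.where(verified)[0])
```

Requiring agreement between detectors trades some sensitivity for fewer false positives, which matters when flagged points are removed before feeding rates into CRM.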