Browsing by Author "Muraya, Moses"

Features Selection in Statistical Classification of High Dimensional Image Derived Maize (Zea Mays L.) Phenomic Data
(Science and Education Publishing, 2022) Gachoki, Peter; Muraya, Moses; Njoroge, Gladys
Phenotyping has advanced with the application of high-throughput phenotyping techniques such as automated imaging. This has led to the derivation of large quantities of high-dimensional phenotypic data that could not have been obtained in a single run using manual phenotyping. Hence the need for parallel development of statistical techniques that can appropriately handle such large and/or high-dimensional data sets. Moreover, there is a need for statistical criteria for selecting the best image-derived phenotypic features to use as predictors in modelling plant growth. Information on such criteria is limited. The objective of this study is to apply feature importance, feature selection with Shapley values and LASSO regression to find the subset of features with the highest predictive power for subsequent use in modelling maize plant growth from high-dimensional image-derived phenotypic data. The study compared the statistical power of these feature extraction methods by fitting an XGBoost model using the best features from each selection method. The image-derived phenomic data was obtained from the Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany. Data analysis was performed using the R statistical software. Missing values were imputed using the k-Nearest Neighbours technique. Feature extraction was performed using feature importance, Shapley values and LASSO regression. The Shapley values extracted 25 phenotypic features, feature importance extracted 31 features and LASSO regression extracted 12 features. Of the three techniques, the feature importance criterion emerged as the best feature selection technique, followed by Shapley values and LASSO regression, respectively. The study demonstrated the potential of using feature importance as a selection technique for reducing the number of input variables in high-dimensional growth data sets.
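To make the LASSO selection step more concrete, the following is a minimal Python sketch of how an L1 penalty drives some coefficients to exactly zero, so that the surviving features form the selected subset. It is not the authors' pipeline (the study used R). The toy data, the penalty value `lam` and all function names are assumptions made for illustration only.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # a constant-zero feature carries no information
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                w[j] = new_w
    return w


def selected_features(weights, tol=1e-8):
    """Indices of features whose coefficient survived the L1 penalty."""
    return [j for j, w in enumerate(weights) if abs(w) > tol]


# Hypothetical toy data: 4 features, only features 0 and 2 drive the response.
rng = random.Random(0)
X = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(80)]
y = [3.0 * row[0] - 2.0 * row[2] for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.05)
selected = selected_features(weights)
print(selected)
```

In a workflow like the one described, the selected indices would then be passed to a downstream model (XGBoost in the study) to compare feature subsets.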
Integration of Cervical Cancer Screening Services in the Routine Examinations Offered in the Kenyan Health Facilities: A Systematic Review
(Scientific Research Publishing Inc., 2019-05-21) Munoru, Florence; Gitonga, Lucy; Muraya, Moses
Cervical cancer is the second most common cancer among women and the leading cause of deaths among women worldwide. In Kenya, uptake of screening services stands at 3.2%, below the target of 70%. Therefore, there is a need to study the factors that lead to low uptake of screening services. One way of increasing uptake is to integrate screening with other routine services, thus reducing the morbidity and mortality rates associated with the disease. The objective of this study was to review and examine the importance of integrating cervical cancer screening services into the routine examinations offered in Kenyan health facilities. A retrospective study design was adopted. The review of articles, journals and strategic plans covered the years 2012 to 2017. Data sources included Medline, PMC, Library, PubMed, Google Scholar, and cancer prevention plans and strategies. About 28 data sources were reviewed, with 78.5% indicating that increased knowledge and awareness of cervical cancer would greatly improve the utilization of screening services. More than 87% of the information collected from published work in Kenya demonstrated that knowledge of the importance of cervical cancer screening is inadequate. The primary results of this study suggest that all women of reproductive age (WRA) should undergo cervical cancer screening as a routine service. An integration approach should be adopted to enhance knowledge of cervical cancer and of the importance of screening, causes, and preventive and treatment options. The study recommends that the Government of Kenya, through the Ministry of Health, should include cervical cancer screening as a routine procedure for all WRA.
A NOTE ON THE BASIC REPRODUCTION NUMBER: NOVEL CORONA VIRUS (2019-nCOV)
(Chuka University, 2021) Ochwach, Jimrise O.; Okongo, Mark O.; Muraya, Moses
The basic reproductive number, R0, is the expected number of secondary infections produced by a single individual during his or her entire infectious period in a completely susceptible population. This concept is fundamental to the study of epidemiology and within-host pathogen dynamics. It is often used as a threshold parameter that can predict whether an infection will spread or not. Since the outbreak of the 2019 novel corona virus disease (COVID-19) in Wuhan and other cities of China, the growth and spread of this disease has been of growing global concern. Many studies have been carried out, and continue to be carried out, to model the spread and subsequent control of the disease. In this paper, we give a brief overview of common methods of formulating R0 from deterministic, non-structured models. Finally, we survey the recent use of R0 in assessing the spread of the novel corona virus.
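The threshold role of R0 can be illustrated with the simplest deterministic, non-structured model. The sketch below assumes a standard SIR model with transmission rate `beta` and recovery rate `gamma`, for which R0 = beta / gamma. The parameter values are hypothetical and are not taken from the paper, and the forward-Euler integration is chosen only for brevity.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the classic SIR model: transmission rate over recovery rate."""
    return beta / gamma


def simulate_sir(beta, gamma, i0=0.001, days=200, dt=0.1):
    """Forward-Euler integration of the SIR model on population fractions.

    Returns the peak infected fraction and the final recovered fraction.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak, r


# Hypothetical parameter sets on either side of the threshold R0 = 1.
peak_above, final_above = simulate_sir(beta=0.5, gamma=0.2)  # R0 = 2.5
peak_below, final_below = simulate_sir(beta=0.1, gamma=0.2)  # R0 = 0.5
print(basic_reproduction_number(0.5, 0.2), peak_above, peak_below)
```

With R0 above 1 the infected fraction grows well beyond its initial value before declining, while with R0 below 1 it never exceeds the initial seed.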
Predictive Modelling of Benign and Malignant Tumors Using Binary Logistic, Support Vector Machine and Extreme Gradient Boosting Models
(Science and Education Publishing, 2019-11-26) Gachoki, Peter; Mburu, Moses; Muraya, Moses
Breast cancer is the leading type of cancer among women worldwide, with about 2 million new cases and 627,000 deaths every year. Breast tumors can be malignant or benign. Medical screening can be used to detect the type of a diagnosed tumor. Alternatively, predictive modelling can be used to predict whether a tumor is malignant or benign. However, the accuracy of the prediction algorithms is important, because a false negative means that the affected person is not put on medication, which can lead to death. Moreover, false positives may subject an individual to unnecessary stress and medication. Therefore, this study sought to develop and validate a new predictive model based on binary logistic, support vector machine and extreme gradient boosting models in order to improve the prediction accuracy for cancer tumors. This study used the Breast Cancer Wisconsin data set available on Kaggle. The dependent variable was whether a tumor is malignant or benign. The regressors were tumor features such as radius, texture, area, perimeter, smoothness, compactness, concavity, concave points, symmetry and fractal dimension of the tumor. Data analysis was done using the R statistical software and involved generation of descriptive statistics, data reduction, feature selection and model fitting. Before model fitting, the reduced data was split into a train set and a validation set. The results showed that the binary logistic, support vector machine and extreme gradient boosting models had predictive accuracies of 96.97%, 98.01% and 97.73%, respectively. This showed an improvement compared to already existing models. The results of this study showed that support vector machine and extreme gradient boosting have better prediction power for cancer tumors than binary logistic regression. This study recommends the use of support vector machine and extreme gradient boosting in cancer tumor prediction and also recommends further investigation of other algorithms that can improve prediction.
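Because the abstract stresses that false negatives and false positives carry different costs, accuracy alone does not fully describe a classifier. The sketch below shows one way to compute accuracy alongside the false negative and false positive rates from validation-set predictions. The labels are hypothetical (1 = malignant, 0 = benign) and do not reproduce the study's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1  # a malignant tumor predicted as benign
            else:
                tn += 1
    return tp, fp, tn, fn


def summarise(y_true, y_pred, positive=1):
    """Accuracy plus the two error rates discussed in the abstract."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical validation labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = summarise(y_true, y_pred)
print(metrics)
```

Reporting the false negative rate next to accuracy makes it easier to compare models such as logistic regression, SVM and gradient boosting on the error type that matters most clinically.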
Service Delivery Factors That Influence Utilization of HIV Integrated Primary Health Care Programme in Embu Referral Hospital, Kenya
(Scientific Research Publishing Inc., 2019-09-25) Githae, Caroline N.; Matiang’i, Micah; Muraya, Moses
Globally, there are approximately 36.7 million people living with HIV. Integration of HIV treatment with primary care services improves effectiveness, efficiency and equity in service delivery. The study sought to establish service delivery factors that influenced utilization of integrated HIV and primary health care services in Embu Teaching and Referral Hospital. A descriptive cross-sectional survey design was used to collect data at a specific point in time from a sample of 302 seropositive clients selected by simple random sampling. The data collection tool was a structured and semi-structured questionnaire. The tool was reliable, with a Cronbach’s alpha of 0.817. SPSS version 23 was used to analyze the data. A binary logistic regression model was used to predict the relationship between service delivery and utilization of integrated services. Results: The majority of the respondents (59.6%) were aged over 35 years, the majority were female (58.9%), and 57.6% of the total sample were married. On service delivery factors, the majority (94.7%) felt that their health status had improved. Regarding action taken when clients developed side effects, 78.8% reported that the drugs were changed. Action taken following drug side effects significantly affected utilization (χ² = 1.305, df = 1, p = 0.001). Waiting time significantly influenced utilization (χ² = 9.284, df = 1, p = 0.002). Source of information on self-care also significantly influenced utilization (χ² = 10.689, df = 1, p = 0.001). Kind of treatment at the facility also significantly influenced utilization (χ² = 5.713, p = 0.048). Conclusion: The significant factors that influenced utilization of integrated services were the source of health care information, waiting time, and the action taken by the health care provider following side effects. The majority of the respondents were satisfied with the time they waited before being served; they reported waiting at most 1 hour to be attended to.
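For readers who want to see how a chi-square statistic with one degree of freedom maps to a p-value, the following Python sketch may help. It uses the identity p = erfc(sqrt(χ²/2)), which holds for df = 1, and the standard Pearson formula for a 2x2 table without continuity correction. The contingency counts are hypothetical, since the abstract does not report the underlying tables, and the study itself used SPSS.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    No continuity correction is applied.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical table: rows = factor present/absent, columns = utilized / did not.
example_stat = chi2_2x2(10, 20, 20, 10)

# Statistics reported in the abstract with df = 1.
p_waiting_time = p_value_df1(9.284)
p_information_source = p_value_df1(10.689)
p_side_effect_action = p_value_df1(1.305)
print(round(p_waiting_time, 3), round(p_information_source, 3))
```

Under this formula the values reported for waiting time and source of information reproduce to three decimal places. The pair reported for action taken following side effects (χ² = 1.305, p = 0.001) does not, which may indicate that one of the two numbers was garbled in the listing.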
Socio-Demographic, Nutritional and Adherence as Determinants of Nevirapine Plasma Concentration among HIV-1 Patients from Two Geographically Defined Regions of Kenya
(IISTE, 2020-10) Mungiria, Juster; Gitonga, Lucy; Muraya, Moses; Mwaniki, John; Ngayo, Musa Otieno
Data on the role of socio-demographic, nutritional and adherence-related factors in influencing nevirapine (NVP) plasma concentrations among the Kenyan population are skewed. This study determined the effect of these three sets of factors on nevirapine plasma concentrations among HIV patients receiving treatment in two regions known for high HIV prevalence and long duration of ART uptake. Methods: Blood samples were collected from 377 consenting adult HIV patients receiving an NVP-based first-line ART regimen. A detailed sociodemographic questionnaire was administered. NVP plasma concentration was measured by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Results: The majority (59.2%) of the patients were female, and 72.2% were from western Kenya (a predominantly Nilotic-speaking community). The patients’ mean age was 41.6 (SD ± 11.5) years and the mean duration of ART was 5.1 (SD ± 4.8) years. The median BMI of the patients was 25 kg/m² (IQR = 22.2 - 28.7 kg/m²). The majority (81.2%) were receiving the 3TC/NVP/TDF ART regimen; 30% had changed their initial ART regimen, and 54.4% reported having missed doses of their current ARVs. Overall, NVP plasma levels ranged from 4 to 44207 ng/mL (median 6213 ng/mL, IQR 3097-8606.5 ng/mL). There were 105 (25.5%) participants with NVP levels of <3100 ng/mL, a level associated with poor viral suppression. Multivariate linear regression analysis showed that region of origin (adjusted β = 976, 95% CI = 183.2 to 1768.82; p = 0.016), gender (adjusted β = 670, 95% CI = 293.6 to 1634.2; p = 0.047), education level (adjusted β = -39.0779, 95% CI = -39.07 to 1085.7; p = 0.068), initial ART regimen type (adjusted β = -548.1, 95% CI = -904.2 to -192; p = 0.003) and ARV uptake in the past 30 days (adjusted β = -1109, 95% CI = -2135 to -83; p = 0.034) remained independently associated with NVP plasma levels. Conclusion: NVP plasma concentration is highly heterogeneous among the Kenyan population, with a significant proportion of patients having levels of <3100 ng/mL, which correlate with poor viral suppression. Host pharmacoecologic factors such as gender, age, weight, education level, region of origin (ethnicity), ART regimen type and adherence are key in influencing NVP plasma concentration. Taking these factors into consideration, HIV treatment may be personalized to achieve optimal treatment success.
Towards a Graph-Theoretic Approach to Hybrid Performance Prediction from Large-Scale Phenotypic Data.
(2015-09) Castellini, Alberto; Edlich-Muth, Christian; Muraya, Moses; Klukas, Christian; Altmann, Thomas; Selbig, Joachim
High-throughput biological data analysis has received a large amount of interest in the last decade due to pioneering technologies that are able to automatically generate large-scale datasets by performing millions of analytical tests on a daily basis. Here we present a new network-based approach to analyze a high-throughput phenomic dataset that was collected on maize inbreds and hybrids by an automated phenotyping facility. Our dataset consists of 1600 biological samples from 600 different genotypes (200 inbred and 400 hybrid lines). On each sample, 141 phenotypic traits were observed for 33 days. We apply a graph-theoretic approach to address two important problems: (i) to discover meaningful patterns in the dataset and (ii) to predict hybrid performance in terms of biomass based on automatically collected phenotypic traits. We propose a modelling framework in which the prediction problem becomes transformed into finding the shortest path in a correlation-based network. Preliminary results show small but encouraging correlations between predicted and observed biomass. Extensions of the algorithm and applications of the modelling framework to other types of biological data are discussed.
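The idea of casting prediction as a shortest-path search in a correlation-based network can be sketched as follows. The abstract does not specify how edge weights are derived, so this example assumes a weight of 1 - |r| (strongly correlated traits are "close") and uses Dijkstra's algorithm. The trait names and correlation values are hypothetical.

```python
import heapq


def build_graph(correlations, min_abs_r=0.0):
    """Turn pairwise correlations into an undirected weighted graph.

    Edge weight is 1 - |r| (an assumption), so weights are never negative.
    """
    graph = {}
    for (a, b), r in correlations.items():
        if abs(r) < min_abs_r:
            continue  # drop weak correlations from the network
        weight = 1.0 - abs(r)
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, cost) or (None, inf) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical trait correlations.
correlations = {
    ("height", "area"): 0.9,
    ("area", "biomass"): 0.8,
    ("height", "biomass"): 0.3,
}
graph = build_graph(correlations)
path, cost = shortest_path(graph, "height", "biomass")
print(path, cost)
```

Here the indirect route through the strongly correlated intermediate trait is cheaper than the weak direct edge, which is the kind of structure a path-based predictor could exploit.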