# Questions tagged [data-science]

Data science concerns extracting knowledge or insights from data, in whatever shape or form. It can involve predictive analytics and usually requires a lot of data wrangling. Consider posting on https://datascience.stackexchange.com/

**0**

votes

**0**answers

2 views

### How to implement fusion layer technique in pytorch?

Currently, I'm working on creating an image colorization model. I want to use the fusion layer presented by Iizuka et al. in it, but I have some problems implementing it in PyTorch.
The basic idea is ...

**-1**

votes

**0**answers

9 views

### How to create a front end for collecting user data for making a tree? [on hold]

I'm making a tree-based application to ask users certain questions about car insurance, based on their inputs. I have people who are working in this field who are ready to give me question inputs if ...

**0**

votes

**0**answers

19 views

### How to decide between categorical and Discrete columns

I am currently working on the Boston competition and I'm at the stage of refining my features. I've gathered, what I presumed to be, categorical and discrete columns and placed them in their ...

**0**

votes

**0**answers

14 views

### How to classify time series trends into 2 groups: “contain seasonality” and “doesn't contain seasonality”

I'm optimizing a prediction model for time series data trends. Each trend may or may not have a seasonality effect.
I want to classify each trend into one of the following groups: "seasonality" or "no ...

**-2**

votes

**0**answers

20 views

### Is there a difference between data and value?

When I was doing my undergraduate degree, I learnt that data is raw fact without meaning, and that nothing more basic than it exists. Recently, in the book "Think Python, How to think like a Computer Scientist", ...

**0**

votes

**1**answer

7 views

### How can I select features for a prediction model using caret for categorical variables?

I found that the caret package in R is very helpful for seeing variable importance for modeling. But I have only categorical variables in my dataset; in this case the 'varImp' command returns variable importance ...

**1**

vote

**1**answer

12 views

### Adding dictionary with unique keys to DataFrame without unique keys

I am trying to do descriptive statistics of a DataFrame using GroupBy, and put those values back into the DataFrame.
My DataFrame contains a non-unique running number which identifies a person (...

**0**

votes

**0**answers

13 views

### Dataframe exported is not showing the complete data, since the str length of the column is 2400

I am trying to export a dataframe from Python where one column has a str length of 2400. However, after exporting it to a CSV file, the data is incomplete. Kindly help.

**1**

vote

**2**answers

18 views

### filtering a Pandas dataframe by one column and getting the sum of values in another column

I have a dataframe with multiple columns (8-10), and one such column is the year column. I have another column called the arrival column. The year column consists of data from 3 years: 2018, 2019 and 2020....

**-2**

votes

**1**answer

11 views

### How to make a word cloud for each cluster in k-means

I am trying to print the data points in each cluster using a word cloud, and my data points are vectorized data (BOW). How can I print the words in each cluster using a word cloud?
I have already found the optimal k for k-means ...

**0**

votes

**1**answer

17 views

### I want to plot AUC w.r.t. the depth of a decision tree, but with the min_samples_split value changing

I want to plot the train AUC and CV AUC w.r.t. the change in depth of a decision tree model, but with the min_samples_split value changing as shown in the code.
If I fix the value of min_samples_split = 5 or 10, then ...

**-1**

votes

**0**answers

21 views

### matplotlib plot not displayed after running code successfully

I am able to run the code below successfully, but the plot is not displayed as expected. I would appreciate it if someone could help.
def plot_rolling(df):
    fig, ax = plt.subplots(3)
    ax[0].plot(df.index, df.data1, label='x')
    ...

**0**

votes

**0**answers

7 views

### Detailed description when hovering over a point in a pointplot using Python

I have a point plot with weeks on the x-axis and scores on the y-axis. When hovering over a point on that graph, I want a pop-up where my observations will appear. Is it possible?
fig,ax1= plt....

**0**

votes

**1**answer

28 views

### Calculating Quantiles based on a column value?

I am trying to figure out a way in which I can calculate quantiles in pandas or python based on a column value? Also can I calculate multiple different quantiles in one output?
For example I want to ...

**-1**

votes

**0**answers

23 views

### GRACE data processing and analyzing using R [on hold]

How can I process and analyse GRACE (Gravity Recovery and Climate Experiment) data using R for terrestrial hydrology monitoring?

**0**

votes

**1**answer

50 views

### Handle missing values : When 99% of the data is missing from most columns (important ones)

I am facing a dilemma with a project of mine. A few of the variables don't have enough data, meaning almost 99% of the observations are missing.
I am thinking of a couple of options:
Impute missing ...

**0**

votes

**0**answers

8 views

### Interpreting the result of the seasonal decomposition method in scipy

I have done a seasonal decomposition of the data, but I can't figure out what the residuals mean. How do you interpret the residuals from a seasonal decomposition to gain helpful insights from them?

**0**

votes

**0**answers

21 views

### Binning - Equal Frequency: Boundaries and Intervals

I'm currently learning Binning methods but I'm struggling with the equal-frequency Binning.
When learning, I had the following example which I don't understand clearly. The dataset looked like this:
...

**0**

votes

**2**answers

46 views

### Python Pandas - Concat two data frames with different number of rows and columns

I have two data frames with different numbers of rows and columns. Both tables have a few common columns, including "Customer ID". Both tables look like this, with sizes of 11697 rows × 15 columns and 385839 ...

**-2**

votes

**0**answers

20 views

### Is there any machine learning model that can predict handwritten text accurately? If so, where can I get the code for it?

I'm in desperate need of building a machine learning model that can detect handwritten text. Could someone please suggest the best available model, with code, to implement in my project?
I have ...

**1**

vote

**3**answers

41 views

### How to play around with JSON date format?

I have a JSON date data set and am trying to calculate the time difference between two different JSON DateTime values.
For example :
'2015-01-28T21:41:38.508275' - '2015-01-28T21:41:34.921589'
Please look ...
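
For the two timestamps quoted above, a minimal sketch using only the standard library might look like the following. It assumes the strings are ISO 8601 values without a timezone offset, as in the excerpt; the variable names are illustrative and not taken from the question.

```python
from datetime import datetime

# Timestamps copied from the question; both are ISO 8601 strings with microseconds.
end = datetime.fromisoformat("2015-01-28T21:41:38.508275")
start = datetime.fromisoformat("2015-01-28T21:41:34.921589")

# Subtracting two datetimes yields a datetime.timedelta.
delta = end - start

# total_seconds() expresses the whole difference as a float number of seconds.
seconds = delta.total_seconds()

print(delta)
print(seconds)
```

If the values live inside a JSON document, they would typically be read with `json.loads` first and then parsed the same way.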

**0**

votes

**1**answer

28 views

### How to fix: 'only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices'?

I'm trying to predict heart disease in patients using a linear regression algorithm in machine learning, and I get this error (only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer ...

**0**

votes

**0**answers

23 views

### Logistic regression: LinAlgError: Singular matrix

I am performing logistic regression on data so that I can predict who has responded to the company, but I am getting a 'singular matrix' error.
import statsmodels.api as sm
logit = sm.Logit(train['...

**0**

votes

**1**answer

32 views

### R loop through the independent variables in lm function

I am having a problem building an lm function based on many independent variables in a for loop. 14 different independent variables (x1, x2, x3, ..., x14) are created in each for loop and as a result ...

**0**

votes

**1**answer

27 views

### Cannot open a csv file

I have a CSV file that I need to work on in my Jupyter notebook. I am able to view the contents of the file using the code in the picture.
When I try to convert the data into a ...

**0**

votes

**3**answers

53 views

### How to read a CSV file every other row

How do I take data from every second row of a CSV file?
For example, if I have a file that looks like this
0 1
0 23 34
1 45 45
2 78 16
3 110 78
4 48 14
5 76 23
6 55 33
7 12 13
8 18 76
how can ...
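
One way to keep every other data row, sketched with the standard library only. The comma delimiter, the header row `0,1`, and the in-memory `csv_text` are assumptions standing in for the real file; the values are the first six rows shown in the question.

```python
import csv
import io

# Hypothetical file contents: a header row followed by the first six
# data rows from the question, assumed to be comma-separated.
csv_text = "0,1\n23,34\n45,45\n78,16\n110,78\n48,14\n76,23\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)   # consume the header so it is not counted as data
rows = list(reader)     # remaining rows, each a list of strings

# Slice with a step of 2: keeps data rows 0, 2, 4, ...
every_other = rows[::2]

print(header)
print(every_other)
```

With the data already in a pandas DataFrame `df`, the same step slicing is available as `df.iloc[::2]`.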

**0**

votes

**1**answer

26 views

### Sentiment Intensity Analyzer

I am getting 4 values for each row from sid.polarity_scores(row), as I want. But for each row, I want the 1st value to go to the 1st empty list I created, and the 2nd value to go to ...

**1**

vote

**0**answers

19 views

### How do I create multiple dataframes according to the values in a column? (big data)

I have a dataframe with 30 thousand rows and 15 columns. One of the columns is called "Account" and specifies each account used. Many rows for example have the value "A" and "B" but it is impossible (...

**1**

vote

**2**answers

22 views

### mongodb - transform an array of objects (key: 'keyname', value:'value') into fields named 'keyname' with corresponding value

The current structure of my mongodb documents is:
{
"_id": "5c9376110a32bd172c0c5a28",
"timestamp": 1553168075444,
"content": [
{
"name": "temperature_x",
"value": 2
},
{...
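
The reshaping described in the title can be sketched in plain Python on a document that has already been fetched. This is a client-side illustration, not a MongoDB aggregation pipeline. The first `content` entry comes from the excerpt; the second entry and the function name `flatten_content` are hypothetical.

```python
# Document shaped like the one in the question.
doc = {
    "_id": "5c9376110a32bd172c0c5a28",
    "timestamp": 1553168075444,
    "content": [
        {"name": "temperature_x", "value": 2},
        {"name": "temperature_y", "value": 5},  # hypothetical second entry
    ],
}


def flatten_content(document):
    """Return a copy in which each content item becomes a top-level field."""
    # Keep every field except the array that is being expanded.
    flat = {key: val for key, val in document.items() if key != "content"}
    # Promote each {"name": ..., "value": ...} pair to its own field.
    for item in document.get("content", []):
        flat[item["name"]] = item["value"]
    return flat


result = flatten_content(doc)
print(result)
```

Note that a later item silently overwrites an earlier one if two entries share the same `name`.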

**-1**

votes

**1**answer

29 views

### How to design a tree to ask questions to make a decision?

I'm trying to make a program that will ask a series of questions so that it returns a suggestion at the end. How could I do this?
I tried using trees, but could not make it properly.
For example, ...
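
A question tree of this kind could be sketched as nodes that hold either a question with answer branches or a final suggestion. Everything below (the `Node` class, the `walk` helper, and the sample questions and suggestions) is hypothetical and only illustrates the structure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    # A question for inner nodes, or the final suggestion for leaves.
    text: str
    # Maps each allowed answer to the next node; empty for a leaf.
    children: Dict[str, "Node"] = field(default_factory=dict)

    @property
    def is_leaf(self):
        return not self.children


def walk(node, answers):
    """Follow the given answers from the root and return the suggestion reached."""
    answers = iter(answers)
    while not node.is_leaf:
        node = node.children[next(answers)]
    return node.text


# Hypothetical example tree.
tree = Node("Do you own a car?", {
    "yes": Node("Is it older than 10 years?", {
        "yes": Node("Suggest basic cover"),
        "no": Node("Suggest comprehensive cover"),
    }),
    "no": Node("Suggest no cover needed"),
})

print(walk(tree, ["yes", "no"]))
```

For an interactive program, the answers would come from `input(node.text)` inside the loop instead of a prepared list.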

**-4**

votes

**0**answers

33 views

### What approach should I take to model forecasting problem in machine learning? [on hold]

I have a dataset which contains 4000k rows and 6 columns. The goal is to predict the travel time demand of a taxi. I have read many articles about how to approach the problem. Every writer tells ...

**0**

votes

**0**answers

14 views

### Does Python's datatable package support out-of-memory datasets?

datatable is a relatively fresh high performance DataFrame/data.table alternative for Python. The datatable documentation states:
It focuses on: big data support, high performance, both in-memory
...

**0**

votes

**0**answers

26 views

### Joining a table on a column that needs to be cast in order to be joined on

I am trying to join on a column that needs to be converted or cast as varchar to match that same column in a different table. But the way I am trying here, I get an error that '<>' cannot be applied ...

**0**

votes

**1**answer

19 views

### How to use computational results of a CSV as search terms in Python/Pandas?

First off, in my real situation I handle much bigger data sets, but here for this minimal, reproducible example (reprex) let's assume:
I have two .csv files. They look like this:
File 1 is called "...

**-3**

votes

**0**answers

17 views

### Create a bar plot where each manufacturer is on the y axis and the height of the bars depicts the number of cereals manufactured by them

Create a bar plot where each manufacturer is on the y axis and the height of the bars depicts the number of cereals manufactured (m_name) by companies.
But the problem is that I don't have any values.
Q1 :- ...

**0**

votes

**0**answers

48 views

### reverse naive bayes (which feature is the most likely cause for response variable)

I'm working with Time series data of sales, where I have 5 products A, B, C, D, E, and total revenue is a summation of revenue of all 5 products. My goal is
1) predict what will be my total revenue ...

**0**

votes

**1**answer

19 views

### Aliasing a table in a window function?

I am trying to alias a table in a window function, but I am not sure what I am doing wrong: when I alias it, I get an error that the columns cannot be resolved.
SELECT e.city,
e.time,
e.day,...

**0**

votes

**1**answer

23 views

### Making both day-first and month-first dates in a csv file day-first

I have a csv file that has a column of dates. The dates are in order of month - so January comes first, then Feb, and so on. The problem is some of the dates are in mm/dd/yyyy format and others in dd/...

**0**

votes

**1**answer

23 views

### LSTM Algorithm Produces Same Results for all Inputs

So, I am currently working on a machine learning algorithm problem pertaining to car speeds and angles, and I'm trying to improve upon some of my work. I recently got done with an XGBRegressor that ...

**-1**

votes

**1**answer

33 views

### Is there a way to take the values from one column in a dataframe and append them to different dataframe's column in pandas python

I'm working with 2 dataframes A & B of different shapes
Dataframe A has 193 rows and 33 columns
Dataframe B has 2 rows and 196 columns
I want to be able to take a column from Dataframe A "...

**-4**

votes

**0**answers

25 views

### Is there a way to suggest the object is not fitting perfectly in a video? [closed]

I have a requirement to build an application to inspect automobiles.
I am trying to build it in python which can suggest in which direction the user has to move the frame(tablet/mobile device) in ...

**0**

votes

**0**answers

35 views

### ValueError: could not convert string to float: When reading .csv dataset

Why do I keep on getting this error?
I suspect it is because get_dummies does not work for categorical data?
Should I label encode my data first?
Any help is appreciated
import pandas as pd
from ...

**1**

vote

**1**answer

52 views

### Using cosine similarity for classifying documents

I have a set of files for five different categories, and most of them are not labelled correctly. The objective is to predict the correct category of a file whenever it is uploaded. I used cosine ...

**-3**

votes

**1**answer

57 views

### How to group column data with similar names to find sum, min, and max?

I'm importing a csv file that contains transposed data. The data has columns in the following format: AC1,AC2,AD1,AD2,BP1,BP2,CT1,CO1,CO2,CS1,etc
What I've been hoping to accomplish is to group ...

**0**

votes

**0**answers

33 views

### LSTM Producing Same Predictions for any Input

So, I am currently working on a machine learning algorithm problem pertaining to car speeds and angles, and I'm trying to improve upon some of my work. I recently got done with an XGBRegressor that ...

**0**

votes

**1**answer

28 views

### NLP Text classification Based on User comments

I am new to machine learning and want to work on this problem statement.
I have some user comments about products, and based on those comments, my model should summarize and give me ...

**-2**

votes

**1**answer

37 views

### How to decode geohash using python in pandas?

I need code to decode geohash in python. There's a column which contains geohashes. I need them decoded into latitude and longitude.
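
A minimal pure-Python sketch of geohash decoding, assuming the standard geohash base-32 alphabet and the usual bit interleaving (longitude first). The function name `decode_geohash` is illustrative; it returns the center of the cell the geohash describes.

```python
# Standard geohash alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return (latitude, longitude) of the center of the geohash cell."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    is_lon = True  # bits alternate between longitude and latitude

    for char in geohash.lower():
        value = BASE32.index(char)  # raises ValueError on invalid characters
        # Each character carries 5 bits, most significant bit first.
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            rng = lon_range if is_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # bit 1: keep the upper half
            else:
                rng[1] = mid  # bit 0: keep the lower half
            is_lon = not is_lon

    lat = (lat_range[0] + lat_range[1]) / 2
    lon = (lon_range[0] + lon_range[1]) / 2
    return lat, lon


lat, lon = decode_geohash("ezs42")
print(round(lat, 3), round(lon, 3))
```

For a pandas column, the function could be applied row by row, for example `df["geohash"].apply(decode_geohash)`, where the column name `geohash` is an assumption.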

**0**

votes

**1**answer

18 views

### What is the best approach to implement Time series forecasting to predict future customer orders?

I have 2 years of historical data on customers, items ordered and the number of orders. Based on this data, I am trying to predict future sales at the customer-item level. I tried an ARIMA model ...

**0**

votes

**0**answers

8 views

### Tensorboard filter/query by metric

After running a search grid for several days/weeks, Tensorboard looks something like this:
Pretty hard to make sense of it as is. How do folks analyze their models in a case like this, where you have ...

**3**

votes

**2**answers

64 views

### Keras MLP classifier not learning

I have data like this
There are 29 columns, out of which I have to predict winPlacePerc (at the far end of the dataframe), which is between 1 (high perc) and 0 (low perc).
Out of the 29 columns, 25 are numerical data ...